{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Naive Bayes: Principles and Sogou News Classification in Practice"
   ]
  },
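  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bayes' Formula\n",
    "Bayes' formula fits on one line:\n",
    "\n",
    "$$P(Y|X)=\\frac{P(X|Y)P(Y)}{P(X)}$$\n",
    "It follows directly from the joint probability identity:\n",
    "\n",
    "$$P(Y,X)=P(Y|X)P(X)=P(X|Y)P(Y)$$\n",
    "Here $P(Y)$ is the prior probability, $P(Y|X)$ is the posterior probability, and $P(Y,X)$ is the joint probability.\n",
    "\n",
    "That is the whole core of Bayes' formula.\n",
    "\n",
    "## Bayes' Formula from a Machine Learning Perspective\n",
    "In machine learning we read X as \"has a certain feature\" and Y as \"class label\" (in most machine learning problems X is the feature and Y is the outcome). In the simplest case, binary classification (a yes/no decision), Y is the label \"belongs to a certain class\". Bayes' formula then becomes:\n",
    "\n",
    "$$P(\\text{belongs to class}\\mid\\text{has feature})=\\frac{P(\\text{has feature}\\mid\\text{belongs to class})P(\\text{belongs to class})}{P(\\text{has feature})}$$\n",
    "Each term reads as follows:\n",
    "\n",
    "- $P(\\text{belongs to class}\\mid\\text{has feature})$ is the probability that a sample belongs to the class, given that it is known to have the feature. This is why it is called the \"posterior probability\".\n",
    "- $P(\\text{has feature}\\mid\\text{belongs to class})$ is the probability that a sample has the feature, given that it is known to belong to the class.\n",
    "- $P(\\text{belongs to class})$ is the probability that a sample belongs to the class before we know whether it has the feature. This is why it is called the \"prior probability\".\n",
    "- $P(\\text{has feature})$ is the probability that a sample has the feature before we know whether it belongs to the class.\n",
    "\n",
    "For binary classification, the final goal is simply to decide whether $P(\\text{belongs to class}\\mid\\text{has feature})$ is greater than 1/2. Bayes' method turns the probability of **\"belonging to the class given the feature\"** into the probability of \"having the feature given the class\", which is much easier to estimate: we only need a set of samples whose features and labels are both known, and we can train on them. Because every training sample carries an explicit class label, Bayesian classification is a supervised learning method.\n",
    "\n",
    "One more remark: \"prior\" and \"posterior\" always come in pairs. P(Y) and P(Y|X) are the prior and posterior probabilities of Y, while P(X) and P(X|Y) are the prior and posterior probabilities of X."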
   ]
  },
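  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small worked example of Bayes' formula, the sketch below estimates a posterior from labelled counts. The function name `posterior` and all of the counts are made up for illustration; they are not taken from the Sogou dataset.\n",
    "\n",
    "```python\n",
    "def posterior(n_both, n_class, n_feature, n_total):\n",
    "    # Return P(class | feature) computed with Bayes' formula from raw counts\n",
    "    likelihood = n_both / n_class      # P(feature | class)\n",
    "    prior = n_class / n_total          # P(class)\n",
    "    evidence = n_feature / n_total     # P(feature)\n",
    "    return likelihood * prior / evidence\n",
    "\n",
    "# Hypothetical corpus: 100 articles, 40 of them about sports;\n",
    "# the word 'match' occurs in 30 sports articles and in 35 articles overall.\n",
    "p = posterior(n_both=30, n_class=40, n_feature=35, n_total=100)\n",
    "print(round(p, 4))\n",
    "```\n",
    "\n",
    "With these hypothetical counts the posterior equals 30/35, about 0.857, which is well above the 1/2 threshold used for a binary decision."
   ]
  },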
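  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sogou News Topic Classification\n",
    "- This is a text classification problem, the classic news topic classification task, solved below with naive Bayes\n",
    "- The dataset can be downloaded from Baidu Netdisk: link: https://pan.baidu.com/s/14yMZNWrrgO7FVlGw4vLS3A  password: wg90"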
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import time\n",
    "import random\n",
    "import jieba  # Chinese text processing (word segmentation)\n",
    "import nltk   # English text processing; also provides a naive Bayes classifier\n",
    "import sklearn\n",
    "from sklearn.naive_bayes import MultinomialNB\n",
    "import numpy as np\n",
    "import pylab as pl\n",
    "import matplotlib.pyplot as plt\n",
    "from collections import Counter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def text_processing(folder_path, test_rate=0.2):\n",
    "    data_list = []\n",
    "    label_list = []\n",
    "\n",
    "    folder_list = os.listdir(folder_path)\n",
    "    for folder in folder_list:\n",
    "        text_folder_path = os.path.join(folder_path, folder)\n",
    "        text_files = os.listdir(text_folder_path)\n",
    "\n",
    "        # Read each file in this class folder\n",
    "        n = 1\n",
    "        for file in text_files:\n",
    "            if n > 100:\n",
    "                # Cap at 100 files per class to limit memory use; this check can be removed later\n",
    "                print(\"n>100\")\n",
    "                break\n",
    "            with open(os.path.join(text_folder_path, file), \"r\") as f:\n",
    "                text = f.read()\n",
    "                # read() returns a str holding the entire file contents\n",
    "                # readline() returns a str holding a single line per call\n",
    "                # readlines() returns a list with one str per line of the file\n",
    "\n",
    "            # Segment the text with jieba\n",
    "            # Enable parallel segmentation (the optional argument is the number of worker processes)\n",
    "            jieba.enable_parallel()\n",
    "            word_cut = jieba.cut(text, cut_all=False) # accurate mode; returns an iterable generator\n",
    "            word_list = list(word_cut)\n",
    "            jieba.disable_parallel() # turn parallel segmentation off again\n",
    "\n",
    "            data_list.append(word_list) # tokenized sample\n",
    "            label_list.append(folder)  # class label (the folder name)\n",
    "            n += 1\n",
    "\n",
    "    # Split into training and test sets\n",
    "    data_label_list = list(zip(data_list, label_list))\n",
    "    random.shuffle(data_label_list)\n",
    "\n",
    "    idx = int(len(data_label_list)*test_rate)+1\n",
    "    print(\"总样本数：\", len(data_label_list))\n",
    "    train_list = data_label_list[idx:]\n",
    "    test_list = data_label_list[:idx]\n",
    "    # print(train_list)\n",
    "    \n",
    "    # zip(*...) returns two tuples: one of samples and one of labels\n",
    "    train_data_li, train_label_li = zip(*train_list)\n",
    "    test_data_li, test_label_li = zip(*test_list)\n",
    "\n",
    "    # Count word frequencies and sort in descending order; only the training set is used so that no test-set information leaks into the vocabulary\n",
    "    vocab_dict = dict(Counter([w for li in train_data_li for w in li]))\n",
    "    vocab_list = sorted(vocab_dict.items(), key=lambda f: f[1], reverse=True)\n",
    "\n",
    "    vocab_list, _ = zip(*vocab_list)\n",
    "    vocab_list = list(vocab_list)\n",
    "    return vocab_list, train_data_li, train_label_li, test_data_li, test_label_li"
   ]
  },
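  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The split and vocabulary steps of `text_processing` can be tried on a toy corpus without the dataset or jieba. The sketch below is an illustration only: the helper `split_and_count`, its `seed` parameter and the toy samples are assumptions, not part of the original notebook.\n",
    "\n",
    "```python\n",
    "import random\n",
    "from collections import Counter\n",
    "\n",
    "def split_and_count(samples, test_rate=0.2, seed=0):\n",
    "    # samples: list of (word_list, label) pairs\n",
    "    rng = random.Random(seed)\n",
    "    shuffled = samples[:]\n",
    "    rng.shuffle(shuffled)\n",
    "    idx = int(len(shuffled) * test_rate) + 1   # same split rule as text_processing\n",
    "    test, train = shuffled[:idx], shuffled[idx:]\n",
    "    # Count word frequencies on the training samples only\n",
    "    counts = Counter(w for words, _ in train for w in words)\n",
    "    vocab = [w for w, _ in counts.most_common()]  # most frequent first\n",
    "    return train, test, vocab, counts\n",
    "\n",
    "toy = [(['a', 'b'], 'x'), (['a', 'c'], 'y'), (['a'], 'x'),\n",
    "       (['b', 'd'], 'y'), (['a', 'b'], 'x')]\n",
    "train, test, vocab, counts = split_and_count(toy)\n",
    "print(len(train), len(test), vocab)\n",
    "```\n",
    "\n",
    "With five samples and `test_rate=0.2` the rule `int(5 * 0.2) + 1` puts 2 samples in the test set and 3 in the training set; which samples land where depends on the seed."
   ]
  },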
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Building prefix dict from the default dictionary ...\n",
      "Loading model from cache /tmp/jieba.cache\n",
      "Loading model cost 1.009 seconds.\n",
      "Prefix dict has been built succesfully.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "总样本数： 90\n",
      "词汇个数： 9875\n",
      "训练集样本个数： 71\n",
      "测试集样本个数： 19\n",
      "训练集标签： 71\n",
      "测试集标签： 19\n"
     ]
    }
   ],
   "source": [
    "random.seed(2019)\n",
    "folder_path = \"Database/SogouC/Sample\"\n",
    "vocab_list, train_data_list, train_label_list, test_data_list, test_label_list = text_processing(folder_path, test_rate=0.2)\n",
    "print(\"词汇个数：\", len(vocab_list))   \n",
    "print(\"训练集样本个数：\", len(train_data_list)) \n",
    "print(\"测试集样本个数：\", len(test_data_list)) \n",
    "print(\"训练集标签：\", len(train_label_list)) \n",
    "print(\"测试集标签：\", len(test_label_list)) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deduplicating Stop Words\n",
    "- Load and clean the stop-word list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Build a deduplicated set of words from a file with one word per line\n",
    "def make_word_set(words_file):\n",
    "    words_set = set()\n",
    "    with open(words_file, \"r\") as f:\n",
    "        for line in f:\n",
    "            word = line.strip()\n",
    "            if len(word) > 0 and word not in words_set:\n",
    "                words_set.add(word)\n",
    "    return words_set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "停用词个数： 428\n",
      "stopwords_set: {'甚至于', '仍旧', '并不', '此次', '全体', '所', '依照', '沿着', '嘛', '否则', '才是', '何处', '万一', '可以', '曾', '但', '且', '您', '那样', '嗡', '或是', '下', '鉴于', '而外', '还是', '另外', '吧', '既然', '说来', '尔', '当地', '这会', '被', '他们', '然而', '本地', '那时', '甚至', '一', '或者说', '或', '受到', '别处', '了', '哟', '诸', '不论', '哪些', '其余', '凡', '只消', '上', '彼此', '哪', '这', '即', '何况', '进而', '果然', '由于', '不尽', '因而', '不单', '距', '之', '多会', '与', '及至', '就是说', '则', '与否', '这里', '例如', '此外', '多么', '朝着', '只有', '简言之', '来自', '这么', '随', '由此', '这边', '后者', '所在', '往', '哪怕', '此处', '既往', '唯有', '不只', '仍', '比如', '以来', '许多', '致', '就是', '是', '若是', '去', '又及', '趁', '又', '同', '除外', '个人', '至今', '宁可', '它们', '由', '使', '得', '在于', '此', '向着', '我', '而已', '个', '全部', '如上', '不然', '比', '两者', '跟', '另', '你', '个别', '自己', '为了', '此间', '那儿', '介于', '诸位', '无', '就算', '一切', '嘻嘻', '有的', '对比', '接着', '倘若', '呵呵', '一些', '只', '一旦', '儿', '前者', '那些', '本着', '譬如', '只限于', '再有', '啦', '而是', '出来', '除非', '怎么办', '所有', '连带', '别的', '不仅', '何以', '自从', '或者', '从', '某些', '当然', '虽然', '正巧', '即便', '至', '即使', '让', '什么', '小', '于', '继而', '还要', '关于', '么', '经过', '和', '那里', '虽说', '怎样', '以免', '本人', '值此', '按照', '截至', '为此', '沿', '据此', '另一方面', '可见', '其', '连同', '我们', '各位', '与其', '也', '而后', '既', '不料', '况且', '来说', '某某', '嘿嘿', '格里斯', '故而', '尽管如此', '人', '很', '替代', '什么的', '假如', '及', '以至', '再', '一来', '不是', '正是', '较之', '这般', '从而', '其它', '给', '对待', '怎么样', '今', '起', '自', '既是', '这儿', '来', '自身', '啥', '因之', '根据', '每', '看', '只需', '并且', '正值', '只限', '如是', '基于', '到', '她们', '各', '为', '以', '着', '遵照', '对方', '以为', '只因', '不管', '最', '首先', '并非', '除了', '他', '任何', '依据', '随着', '总之', '同时', '如果说', '们', '某个', '凭借', '不如', '于是', '用来', '拿', '那么', '不外乎', '这些', '彼时', '几', '以及', '据', '那', '打', '要不', '处在', '那边', '等等', '再则', '从此', '哪个', '随后', '在', '可是', '你们', '除此', '反而', '的', '遵循', '别', '至于', '别人', '的确', '为止', '作为', '靠', '随时', '人们', '尽管', '已', '不仅仅', '怎', '如同下', '只怕', '有时', '就要', '后', '本身', '其次', '此地', '咱们', '以上', '这个', '对于', '如若', '为什么', '这样', '才能', '有些', '各自', '因', '要不然', '因为', '那个', '咱', '此时', '凭', '何', '该', '的话', '加之', 
'便于', '哪儿', '似的', '其中', '诸如', '若非', '其他', '多少', '不', '它', '直到', '大家', '谁人', '得了', '为何', '非但', '别说', '以致', '她', '如此', '分别', '甚而', '正如', '用', '每当', '可', '要么', '虽', '开外', '固然', '加以', '但是', '若', '却', '还有', '而且', '何时', '他人', '不过', '要是', '逐步', '以便', '如何', '然后', '为着', '不光', '如', '有关', '乃至', '某', '趁着', '只要', '只是', '凡是', '哇', '反之', '针对', '照着', '无论', '那般', '不但', '向', '乃', '还', '把', '并', '因此', '亦', '有', '当', '什么样', '如果', '谁', '出于', '如下', '而', '怎么', '赖以', '些', '不至于', '所以', '毋宁', '不尽然', '之所以', '光是', '好'}\n"
     ]
    }
   ],
   "source": [
    "stopwords_file = \"./stopwords_cn.txt\"\n",
    "stopwords_set = make_word_set(stopwords_file)  # deduplicated stop words\n",
    "print(\"停用词个数：\", len(stopwords_set))\n",
    "print(\"stopwords_set:\", stopwords_set)"
   ]
  },
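  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting Representative Feature Words from the Vocabulary\n",
    "- The vocabulary built in the first step contains many generic, uninformative words that need to be removed.  \n",
    "- Representative words are most likely the ones that help tell the classes apart. They are used later as the input features of the model."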
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def vocab_select(vocab_list, deleteN, stopwords_set=set()):\n",
    "    # Select feature words\n",
    "    feature_words = []\n",
    "    n = 1\n",
    "    # Start at index deleteN, i.e. skip the deleteN most frequent words,\n",
    "    # because the highest-frequency words tend to occur often in every class\n",
    "    for t in range(deleteN, len(vocab_list), 1):\n",
    "        if n > 1000:\n",
    "            # Keep at most 1000 words, so the feature dimension is 1000\n",
    "            break\n",
    "        # Keep a word only if it is not a number, not a stop word, and 2 to 4 characters long\n",
    "        if not vocab_list[t].isdigit() and \\\n",
    "            vocab_list[t] not in stopwords_set and \\\n",
    "            1<len(vocab_list[t])<5:\n",
    "                feature_words.append(vocab_list[t])\n",
    "                n += 1\n",
    "    return feature_words"
   ]
  },
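  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three filtering conditions are easy to check on a tiny vocabulary. The sketch below is a compact restatement of the same rule; the name `select_features`, the `limit` parameter and the sample vocabulary are assumptions made for this illustration.\n",
    "\n",
    "```python\n",
    "def select_features(vocab, skip, stopwords, limit=1000):\n",
    "    # vocab is assumed to be sorted by descending frequency\n",
    "    selected = []\n",
    "    for word in vocab[skip:]:\n",
    "        if len(selected) >= limit:\n",
    "            break\n",
    "        # Keep words that are not numbers, not stop words, and 2 to 4 characters long\n",
    "        if not word.isdigit() and word not in stopwords and 1 < len(word) < 5:\n",
    "            selected.append(word)\n",
    "    return selected\n",
    "\n",
    "toy_vocab = ['的', '我们', '2019', '公司', '旅游', '中华人民共和国', '市场']\n",
    "toy_features = select_features(toy_vocab, skip=1, stopwords={'我们'})\n",
    "print(toy_features)\n",
    "```\n",
    "\n",
    "Here `'的'` is skipped as the most frequent word, `'我们'` is a stop word, `'2019'` is a number and `'中华人民共和国'` is longer than four characters, so only `'公司'`, `'旅游'` and `'市场'` remain."
   ]
  },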
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['公司', '一个', '游客', '旅游', '导弹', '考生', '大陆', '认为', '火炮', '台军', '进行', '时间', '一种', '解放军', '各种', '美国', '没有', '北京', '市场', '作战', '支付', '志愿', '成为', '已经', '仿制', '发展', '复习', '远程', '工作', '很多', '建设', '主要', '可能', '目前', '通过', '企业', '五一', '问题', '品牌', '学习', '黄金周', '射程', '银行', '技术', '一定', '部分', '基础', '增长', '部署', '分析', '上海', '亿美元', '学校', '考试', '词汇', '选择', '辅导班', '期间', '完全', '能力', '记者', '文章', '时候', '表示', '训练', '专业', '毕业生', '部队', '需要', '重要', '专家', '收入', '提高', '填报', '今年', '军事', '阵地', '计划', '必须', '达到', '坦克', '影响', '用户', '电话', '管理', '科学', '开始', '拥有', '表现', '资料', '万人次', '几乎', '来源', '发现', '相关', '准备', '服务', '提供', '要求', '销售']\n"
     ]
    }
   ],
   "source": [
    "deleteN = 20 \n",
    "# Skip the 20 most frequent words; this value can be tuned\n",
    "feature_words = vocab_select(vocab_list, deleteN, stopwords_set)\n",
    "print(feature_words[:100])"
   ]
  },
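  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building Fixed-Length Feature Vectors for the Training and Test Sets\n",
    "- This step prepares the data for training the naive Bayes model.\n",
    "- Documents vary in length, so each sample must be mapped to a fixed number of dimensions before it can be fed to the model.\n",
    "- NLTK and scikit-learn expect different training data formats, so each is handled separately"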
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def text_features(train_data_list, test_data_list, feature_words, flag=\"nltk\"):\n",
    "    def text_features(text, feature_words):\n",
    "        text_words = set(text) # unique words in this sample\n",
    "        if flag == \"nltk\":\n",
    "            # For each of the 1000 feature words, record whether it occurs in the sample.\n",
    "            # NLTK expects one dict per sample: key = word, value = 1 if present, otherwise 0.\n",
    "            features = {word:1 if word in text_words else 0 for word in feature_words}\n",
    "        elif flag == \"sklearn\":\n",
    "            # scikit-learn expects one list per sample:\n",
    "            # 1 if the feature word occurs in the sample, otherwise 0\n",
    "            features = [1 if word in text_words else 0 for word in feature_words]\n",
    "        else:\n",
    "            features = []\n",
    "        return features\n",
    "    # Training samples as a 2-D list \n",
    "    train_feature_list = [text_features(text, feature_words) for text in train_data_list]\n",
    "    # Test samples as a 2-D list\n",
    "    test_feature_list = [text_features(text, feature_words) for text in test_data_list]        \n",
    "    return train_feature_list, test_feature_list              "
   ]
  },
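  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the fixed-length encoding concrete, the sketch below converts one tokenized document into a 0/1 vector over a tiny feature list. The helper `to_binary_vector` and the sample words are hypothetical and only mirror the `sklearn` branch of `text_features`.\n",
    "\n",
    "```python\n",
    "def to_binary_vector(words, feature_words):\n",
    "    # 1 if the feature word occurs anywhere in the document, otherwise 0\n",
    "    present = set(words)\n",
    "    return [1 if w in present else 0 for w in feature_words]\n",
    "\n",
    "toy_feature_words = ['bank', 'match', 'tour']\n",
    "toy_doc = ['the', 'match', 'was', 'a', 'great', 'match']\n",
    "toy_vec = to_binary_vector(toy_doc, toy_feature_words)\n",
    "print(toy_vec)  # [0, 1, 0]\n",
    "```\n",
    "\n",
    "The vector length always equals the number of feature words, whatever the document length. A repeated word such as `'match'` still contributes a single 1, so these are presence indicators rather than counts. scikit-learn also provides `BernoulliNB` for this kind of 0/1 feature; the notebook itself uses `MultinomialNB`."
   ]
  },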
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "训练样本个数： 71\n",
      "测试样本个数： 19\n",
      "样本特征维度： 1000\n",
      "测试集的第6个样本的前100个值： [1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]\n"
     ]
    }
   ],
   "source": [
    "flag = 'sklearn'\n",
    "train_feature_list, test_feature_list = \\\n",
    "    text_features(train_data_list, test_data_list, feature_words, flag)\n",
    "print(\"训练样本个数：\", len(train_feature_list)) \n",
    "print(\"测试样本个数：\", len(test_feature_list))   \n",
    "print(\"样本特征维度：\", len(test_feature_list[5]))  \n",
    "print(\"测试集的第6个样本的前100个值：\",test_feature_list[5][0:100]) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training and Predicting with the Naive Bayes Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Train a classifier and return its accuracy on the test set\n",
    "def text_classifier(train_feature_list, test_feature_list, \n",
    "                    train_label_list, test_label_list, flag=\"nltk\"):\n",
    "    if flag == 'nltk':\n",
    "        ## NLTK classifier: expects a list of (feature_dict, label) pairs\n",
    "        train_flist = zip(train_feature_list, train_label_list)\n",
    "        train_flist = list(train_flist) \n",
    "        test_flist = zip(test_feature_list, test_label_list)\n",
    "        test_flist = list(test_flist)  # fixed: previously overwrote train_flist with the test data\n",
    "        classifier = nltk.classify.NaiveBayesClassifier.train(train_flist)\n",
    "        test_accuracy = nltk.classify.accuracy(classifier, test_flist)\n",
    "        \n",
    "    elif flag == 'sklearn':\n",
    "        ## scikit-learn classifier\n",
    "        classifier = MultinomialNB().fit(train_feature_list, train_label_list)\n",
    "        # For MultinomialNB() usage and parameters see: https://www.cnblogs.com/pinard/p/6074222.html\n",
    "        # https://scikit-learn.org/stable/modules/naive_bayes.html#complement-naive-bayes\n",
    "        test_accuracy = classifier.score(test_feature_list, test_label_list)\n",
    "    else:\n",
    "        test_accuracy = []\n",
    "    return test_accuracy"
   ]
  },
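  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For intuition about what a multinomial naive Bayes classifier computes, here is a minimal from-scratch sketch with additive (Laplace) smoothing. It is a simplified illustration and not scikit-learn's implementation; the names `train_nb`, `predict_nb`, `alpha` and the toy data are assumptions.\n",
    "\n",
    "```python\n",
    "import math\n",
    "from collections import defaultdict\n",
    "\n",
    "def train_nb(X, y, alpha=1.0):\n",
    "    # X: list of equal-length feature vectors, y: list of class labels\n",
    "    n_features = len(X[0])\n",
    "    class_count = defaultdict(int)\n",
    "    feat_sum = defaultdict(lambda: [0.0] * n_features)\n",
    "    for row, label in zip(X, y):\n",
    "        class_count[label] += 1\n",
    "        for j, v in enumerate(row):\n",
    "            feat_sum[label][j] += v\n",
    "    model = {}\n",
    "    for label, n in class_count.items():\n",
    "        total = sum(feat_sum[label]) + alpha * n_features\n",
    "        log_prior = math.log(n / len(y))                                  # log P(class)\n",
    "        log_lik = [math.log((s + alpha) / total) for s in feat_sum[label]]  # log P(feature | class)\n",
    "        model[label] = (log_prior, log_lik)\n",
    "    return model\n",
    "\n",
    "def predict_nb(model, row):\n",
    "    # Pick the class with the highest log posterior (up to a shared constant)\n",
    "    def score(label):\n",
    "        log_prior, log_lik = model[label]\n",
    "        return log_prior + sum(v * ll for v, ll in zip(row, log_lik))\n",
    "    return max(model, key=score)\n",
    "\n",
    "toy_X = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]\n",
    "toy_y = ['sports', 'sports', 'travel', 'travel']\n",
    "toy_model = train_nb(toy_X, toy_y)\n",
    "print(predict_nb(toy_model, [1, 0, 0, 0]), predict_nb(toy_model, [0, 0, 1, 0]))\n",
    "```\n",
    "\n",
    "The shared denominator $P(X)$ is dropped because it does not change which class scores highest, and working in log space avoids underflow when many small probabilities are multiplied."
   ]
  },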
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "测试准确率： 0.7368421052631579\n"
     ]
    }
   ],
   "source": [
    "flag='sklearn'\n",
    "test_accuracy = text_classifier(train_feature_list, test_feature_list, \n",
    "                                    train_label_list, test_label_list, flag)\n",
    "print(\"测试准确率：\", test_accuracy)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parameter Tuning and Visualization\n",
    "This step varies deleteN to see how it affects the model's accuracy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\n",
      "总样本数： 90\n",
      "[0.6842105263157895, 0.6842105263157895, 0.6842105263157895, 0.6842105263157895, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.6842105263157895, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.6842105263157895, 0.7368421052631579, 0.7368421052631579, 0.6842105263157895, 0.7368421052631579, 0.6842105263157895, 0.631578947368421, 0.6842105263157895, 0.6842105263157895, 0.5263157894736842, 0.5789473684210527, 0.5789473684210527, 0.631578947368421, 0.5789473684210527, 0.5789473684210527, 0.5789473684210527, 0.5789473684210527, 0.631578947368421, 0.631578947368421, 0.5789473684210527, 0.5789473684210527, 0.5789473684210527, 0.5789473684210527, 0.5789473684210527, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.631578947368421, 0.5789473684210527, 0.5789473684210527, 0.5263157894736842, 0.5263157894736842]\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEWCAYAAAB1xKBvAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3Xl8JHWd+P/XO30k3ZnMpDMHzEwHhhtFv4DOT0D57noCsl9Fdz3AXRVFXFf96R76FdddUfT7XXVXxd1FxQPPRVR0lVUUFa8VBRncERcEGQ5Jz8FkJp2ZSbpzv79/VFWn0umjOl3VnXTez8cjj6Srqqs+Vd2pd31uUVWMMcaYerranQBjjDErgwUMY4wxgVjAMMYYE4gFDGOMMYFYwDDGGBOIBQxjjDGBWMBY5kTk6SKSa+L9HxeRvw8zTRWOoSJyYpV1fyoi34vouH8hIo+JyJiIrA+w/SMi8uwA221zzykeTkpXBhH5sYi8pt3pMMuXBYwWcG9URffGtk9EPisiayI4zqUi8jP/MlV9naq+J+xjBaWq/6aq54W9XxFJAB8CzlPVNap6MOxjBEzHomteZ/unu8HomrLlPxORS0NPYEhE5F0i8sWQ9lX1AcMsbxYwWud5qroGOAM4E3h7m9Oz0h0F9AD3tDshSzAOvEJEtrU5HaaG1ZbDDMICRoup6j7gFpzAAYCIdIvIP4nIo24Ry8dFJFXp/SJyhYg8KCJHROReEXmhu/xxwMeBc9yczKi7/LMi8l7f+y8XkV0iMiIiN4nIFt86FZHXicgDIpIXkWtERNx1J4rIT0TkkIgcEJEvlyXt2VXet+AJ3D3Gm0TkIXc//ygiFb+H7nW5WkT2uD9Xu8tOBu53NxsVkR9Wef/LReT3InJQRN5Rtq7Ldy0PishXRGSgyn7WicinRWSviOwWkfeKSKzGNa/3eY4CnwWurHK8etfav+1X3VzrIRH5qYic5lv3Wfez+Lb7fblDRE7wrX+OiNznvvdfAalyjAuAvwVe6p7nr2tdl1rnICI/dXf7a3dfL61xbhkR+ZaIDLvfq2+JSNa3fkBEPuN+N/Ii8g3fuotEZKeIHHY/4wvc5QuKJcWXc5L5osjLRORR4IcBrnFKRD7ofs8OiZNTTLnX/P8vO5+7ReQF1c53RVBV+4n4B3gEeLb7dxb4DfAR3/qrgZuAAaAP+A/gH9x1Twdyvm1fDGzBCfYvxXla3eyuuxT4WdmxPwu81/37mcAB4ElAN/AvwE992yrwLaAfOAYYBi5w130JeId73B7g3IDvW5Amd9sfued6DPA74DVVrttVwO3AJmAj8HPgPe66be6+4lXe+3hgDPgD91w/BMz4Poe/dPeddddfC3yp0r6Bb7jre920/BL48xrXvO7nCRwNHAZOcZf/DLi03rWucJ6vdo/R7R53Z9lnPwI8BYgD/wbc4K7b4B7/RUAC+Cv3+lT7LN4FfLFsWa3rUu/7cmKA/5v1wJ8Aafccvwp8w7f+28CXgYx7Dn/oLn8KcAh4jnv8rcCp5f+L5efl+9w/755TKsA1vgb4sXuMGPBUd7uXAHf4tjsdOAgk230/aupe1u4ErIYf90s6Bhxxv5C3Av3uOsG56Z/g2/4c4GH376fjCxgV9r0TuMj9+1JqB4xPAx/wrVsDTAPb3Nda9o/9FeAK9+/PA58AshXSUOt9C9LkbnuB7/XrgVurnNuDwIW+1+cDj7h/e//c1QLGO3Fvju7rXmCK+YDxW+BZvvWb3WsR9+8bp+hr0rt5uNteAvyoyvkF/jyBDwBfdv/2B4yq17rO96zfTfc632f/Kd/6C4H73L9fAdxelu4cAQNGgOtS7/tSN2BUeN8ZQN73ec0BmQrbXQt8uMb/Yr2AcXyQa4wTjIrA6RW268YJ1ie5r/8J+Gij57zcfqxIqnVeoKp9ODeMU3Ge8MB5ck4Dd4nIqFus8V13+SIi8go3
q+1t+wTfvurZAvzee6GqYzhPPVt92+zz/V3ACSoA/xvnpvJLEblHRF5dtu9q76tkyPf379101U1vnW0rvbd0HFUdxzlXz7HAv/uu42+BWZwbIWXbJYC9vm2vxXmirqSRz/P9wPkicnrZ8nrXGgC3WOx9bpHLYZybISz8PlT7XMqvj7Lwc6mn3nUJdA61iEhaRK51i3sOAz8F+t1ir0FgRFXzFd46iPOwsVSl61DnGm/AyT0tOpaqTuI8OP2ZOEWulwBfaCJNy4JV6rSYqv5ERD6L88TxApwioiJwmqrurvVeETkW+CTwLOAXqjorIjuZL3uuN/TwHpx/dG9/vTjZ/prHddO9D7jcfd+5wA9E5KequqveeysYZL6y+hg3XbXSG2TbcnuBx3kvRCSNc66eIeDVqnpb+RtlYWX0EM6T9AZVnalwnPJrHvjzVNWDInI18J6y5UGv9cuAi4Bn49zI1gF5qtRFlNmL8zngHkf8ryslt+x1zesS0vflb4BTgLNUdZ+InAH8F875DQEDItKvqqMV0nYClY3jBHTP0RW28Z9rrWt8AJhwj/XrCvv5HE6Q+BlQUNVfVEnTimE5jPa4GniOiJyhqnM4QeDDIrIJQES2isj5Fd7Xi/NlHna3exVODsPzGJAVkWSV414PvEpEzhCRbuD/4pSzPlIvwSLyYl+FY95Nx2y991XxVrdCcxB4M045dCVfAv5ORDaKyAacYqagTTtvBP6XiJzrXo+rWPh9/zjwf9wgjHuMi8p3oqp7ge8BHxSRteJUlp8gIn/obrLgmjf4eYJTt/JUFga3oNe6D+emfRDnJvh/a1+SBb4NnCYifyxOa6A3Ufnm6XkM2OY+Lde9LnXO4THg+ABp7MMJvqPiNEi40lvhHv87wEfd71JCRP7AXf1pnO/5s9x0bRWRU911O4GL3e2349Th1EtDxWvsftbXAR8SkS1ubuQc938LN0DMAR+kA3IXYAGjLVR1GKeM1+tQ9zZgF3C7m+39Ac6TVfn77sX58v0C55/uiYD/CfmHOE/j+0TkQIX33+oe82s4T5gnABcHTPb/B9whImM4FbpvVtWHA7633DeBu3D+eb+N8w9eyXuBHcDdOA0FfuUuq0tV7wHegBMk9+LctPwdID+Ccx7fE5EjOBXgZ1XZ3SuAJHCvu58bccrQofI1D/R5uuk8jFOX4W+hFfRafx6nmG63m7bbq6S/0nEP4DSgeB/OzfAkFn6Xyn3V/X1QRH7l/l3rutQ6h3cBn3OLsl5S45hXAymcJ/nbcYr2/F6OU+90H7AfpyEDqvpL4FXAh3Eqv3/CfM7673G+93ng3Tjfj1rqXeO34Hw378Sps3g/C++rn8f5Pw2lD0u7iVshY0xLiIjiVAQupSjLmBVFRF4BvFZVz213WsJgOQxjjImAW2/2epzWYh3BAoYxpm1E5G/F6cBX/vOddqetGW6d1TBO0XG9Yq8Vw4qkjDHGBGI5DGOMMYF0VD+MDRs26LZt29qdDGOMWVHuuuuuA6pasbOwX0cFjG3btrFjx452J8MYY1YUEfl9/a2sSMoYY0xAFjCMMcYEYgHDGGNMIBYwjDHGBGIBwxhjTCAWMIwxxgRiAcMYY0wgHdUPYzn76e+G2fHISCj76k7EuPSp2+jtDvbx3f7QQX6+a9Fo50uSjHfx8rO3sS6dCLT9rx7N8+P79ldcd9zGXl54Zrbiukq+uXM3f3jyRvrT1ab7MMZEyQJGi7zjG79haKSIBJkLrQZv6K9jBtI87/Rgs5Ve9R/3cu/ew6Ede6C3m5eddUyg97z/O/dxx8Mji47t7euC0zaTSsbq7mf3aJE337CTv73wVF77B9UmUzPGRMkCRgvMzM6xZ3SCNz7jRN5yfsV5dAIbn5zhtCtvYShfCPyeoXyBV55zLO++6An1N65hdk455e++Q66BY+fyRf74SVv50EvOWLD8mzt38+YbdrJ7tMCJm/rq7mdopOD+LjaWaGNMaKwOowX2HZ5gdk7J
ZlJN76u3O85Ab5JcPtiN81BxmiMTM2Qz6fob1xHrErb0pwIfe3p2jr2HihWP7V2LoYD78o7ZSLAyxoTLAkYLeDe7MG7azn6C37S9G2wYwWr+2MFu2vsOTTCnlY/tXYtGzyPo9saY8FnAaAGvOCXUm/ZIsJu2V4QTZrAKmiuodd4b13STjHU1fB65fBGbw8WY9rCA0QK5vFPZvbm/J5T9ZTNpcqNF5ubq3zjDz2GkGT4yycT0bIBjOzf5wQrBqqtL2LqEnFJxepaD41MNpNgYExYLGC2Qyxc5em0P3fH6rYGCGMykmJqZ48DYZKBjr+mO0x+wGWzdYw84gWf3aP0bfS5fINYlbF5XOVA2UryVyxdZ2xMv/W2MaT0LGC2QyxdCe8KH+eKlIEVDuXyRbCaFNNumtuzYQW7aXqCMxyp/zbKZdKD9zMzOse/wBGcdv97dr1V8G9MOFjBawLlph1OHAPPFS0FunOEHq0aOXax57GwmxcHxKQpTMzX3s/eQ08rs7FLAsByGMe1gASNi3tNxmDftraWbdu0bp6qGHqw29fWQiEmg/hBD+ULNY3vXZHed8/D6nJx6dB+ZdMJyGMa0iQWMiHlPx2EGjHQyzvoAfTEOFacZm5wJ9djzfTFq37SnZuoHyqDFW/PNklNkM2nrvGdMm1jAiJj3dFyppVAzsgPpujftsPt/eAYD1D3sPVREFQYHqh/bq0Cv12u91MpsXaqhinJjTLgsYEQsqpt2kM57YTepbezY87mCajau6aY73hXoPI5e20My3lU6tvXFMKb1LGBELJcv0iVwdJWmpUuVzaTYna/dF6NWP4hmj31grHZfjCDBSsTri1E/h+GdQzaTZnJmjgNj1hfDmFazgBEx/9NxmLKZNFOzcwzX6IuRyxfp646zNhXuGJNB6h5y+SKxLuHotbUDZZCmtbt9ra28YiwrljKm9SxgRCw3Em4rJU+Q5q1DIwW2htgHo/zYteoehkYKbF5XvQ+Gf1+1Asb8AIYpd/vgfVCMMeGygBGxsPtBeAYDNK0Nu0mtJ2gOI8h5ZzMpRsanGJ+s3Bdj76g3gKFzzK39lsMwpl0sYESo1LS0RkuhpSo9aVcZvM/pg1EoFeGEaVOfO3BgjZu2v96hlsE6wae8LqTR4d2NMeGxgBGhWsN7N6snEWPDmu6qN87RwjTjU7OR5DDqDRw4OTPLY0cmAh27XtFapVZmjQzvbowJjwWMCEXVrNVT68YZpFlrVMfeOzqBBgyU9Yq3cvkCXWUj/VpfDGPawwJGhKJq1uqpdeMcakWwqlIc1sixN6xJun0xqu2ryOZ1KRK+yvPBTJrd1hfDmJazgBGhIffpOOw+GJ5sJs3uKvNizOduogpW6aoDB5ZyNwHqbkSkTk6pUBo7a/7YKSZn5hg+Un94d2NMeCxgRChX4ek4TNlMiulZZX+FG2cuX6SvJ866VDjzYFQ6NlQeODCXLxAP0Adjfl/V+2JUam1lTWuNaQ8LGBGKqkmtxxunqVJ/iKCtlJaqVt1DLl9kS3+KWFew/h+DA6mK5zA/gOHC82hkiHVjTHgsYEQoqn4Qnlo3zsiDVc1jB+uD4clm0owWpjkyMb1geWkAw7J9BR3e3RgTLgsYEQkyvHezSp3Yyob7VlWGIuph7tmwpptkvKtisdDQSGPBqlS8VTbtqzeMefl5BB3e3RgTLgsYEfGejqMMGD2JGBv7FvfFGBmfojg9G+mxu7qEbIV5MSamZ9l/ZLKhYFUq3ioLfLWaJVvTWmNazwJGRKo9HYctm0mRG11444y6D4anUue9PaONH7ta0Zo3gOHmCq3MnPlALIdhTCtZwIiId/OLYmgOv0qTGZX6f0QwJMmCY1e4aS/l2Ot7k6QSsQr7ckb6rTSAYZDh3Y0x4bKAEZGgw3s3K5tJsWe0yKzvxukFq/L+C1Ecu3zgwKXkbry+GOUtpWpVngcZ3t0YEy4LGBHJ5YMN792sbCbN9Kzy2OEJ37GLrEslWNsTTR8M
/7FhYWV1Ll8gERM29TUWKCt13svli1VzKta01pjWs4ARkUabli5VtkIT06GIm9SWH9s/Yu5Qg30w5ve1sHhrfgDDyucRZHh3Y0y4Ig0YInKBiNwvIrtE5IoK6z8sIjvdn9+JyKhv3StF5AH355VRpjMKzk072joEqPyk3c5gtdT+H9lMikPFaQ67fTH2lAYwrHwNt/bXn5PDGBOuyAKGiMSAa4DnAo8HLhGRx/u3UdW/UtUzVPUM4F+Ar7vvHQCuBM4CngJcKSKZqNIatsmZWR47PNmSm/aW/oU3bW8ejFYEq41ruhcNHJjLF8n2N37sUvGWex71RvpNJZ3h3avNB2KMCV+UOYynALtU9SFVnQJuAC6qsf0lwJfcv88Hvq+qI6qaB74PXBBhWkO1Z9SpT4hyaA5PTyLGUWu7SzfYg+NTTEzPLeodHYXygQMnpmcZPjK5pJZh83N1Fxf8rhV0bV4MY1oryoCxFRjyvc65yxYRkWOB44AfLuG9rxWRHSKyY3h4uOlEhyHqeTDKZTPpUr+PShMORX1s75i7R5d+7PIZBIMMYGid94xprSgDRqVaz2qN5i8GblTV2Ubfq6qfUNXtqrp948aNS0hm+BoZ3jsM/s573g03G3H/D/+xveawpWMvIVBm0gnSyfm+GEMjRTb3125lVmt4d2NM+KIMGDlg0Pc6C+ypsu3FzBdHNfreZcd7Oj6qr7slx8tmUuwdnWBmdq50w/XGmYr+2PMDBzaTu5kv3prPYdSrC6k1vLsxJnxRBow7gZNE5DgRSeIEhZvKNxKRU4AM8Avf4luA80Qk41Z2n+cuWxGCPB2HKZtJMzOnPHZkkly+QH86QV/EfTDmjz0/cGAuX3T7YCwtUPqLt4K09Co167ViKWNaIrI7mqrOAG/EudH/FviKqt4jIleJyPN9m14C3KC++TZVdQR4D07QuRO4yl22IgR5Og5TqXnrSKFlTWoXH7vozI7Xn6KrwT4Y/n3l8oXAAxh6nfqsHsOY1ohHuXNVvRm4uWzZO8tev6vKe68DrosscRHK5Ys8/ZTW1acM+iYzyuULnHxUX+uO7btp1+qZHWhfmTSHJ2a4b98Rd9+1A1+14d2NMdGwnt4hW8rw3s3a3N+DiFM00+ocxvreJD2JLjdYNXds7713PHTQfV37GlYb3t0YEw0LGCFbyvDezeqOxziqr4edQ6NMzsy1NFg5ldVpHtg/xoGx5gKl995flAJG/WtYaXh3Y0w0LGCEbKjF/SA82UyKOx8eKf3d6mPveKT5Y3vvvfPhEaeVWYCRfsvHoDLGRMcCRsha3WnPk82kGJ+adf9ufbCaP/bSz7s/naA3GWN8ajbwAIaVhnc3xkTDAkbIvKalQZ6Ow+SvbG51sPIPgdLMcCgiUjqPoOcwWGF4d2NMNCxghCy3xOG9m+XdYAd6k/R2R9r4rcKxnZt8Mt7FhjXNdVb0ziNo4Kk0Yq4xJhoWMEK21OG9m+XdtNtzbOeY2Sb6YMzvq7HzsImUjGmd1j6KLlOHitNMzcyFsq+hkSLPOnVTKPtqROmm3caAEcaUsKXzCDgWVvnw7kHlx6eYqVDv0Z3oinymQmNWKgsYwNtuvJvv3rMvtP0ds761lc4Am9elSMa6OHZ9b8uPPdCbpK8nzrYQju2lP+h59CRibOprbF6Mf/+vHH/15V9XXNcl8O03/U8et3lt4P0Zs1pYwAAuOesYnnbShlD2Fe8SLnzC5lD21YhkvIvrLz+L4zeuafmxRYQvXHYWW/qbr+h/5qmbuO7S7Zw52B/4PY3Oi3F37hA9iS7e8UcL5vNi72iRj/74QXbnixYwjKnAAgbwhycvj2HRm7V920Dbjn1GAzf4WmJdwjNPPaqh9wwOpPnVo/nA2+fyRY4ZSPPys49dsHzX/iN89McPUpierfJOY1Y3q/Q2K55/ePcgcvlixVZYqaTz/FScmgk1fcZ0CgsYZsXzD+8eRLWWbKlEDIDilOUwjKnEAoZZ8fzDu9dz
qDjNkYmZir3h00knYFiRlDGVWcAwK15pPvAAFd+1hm7pjnchYjkMY6qxgGFWvC3u8O5BOu8NjVQfHFJESCViFCxgGFORBQyz4nnDuwdpWusFlWqTM6WTMYpWJGVMRRYwTEfwpnetJ5cvsqY7zrpU5d7cqWTMiqSMqcIChukIQTvvebMCilQe8yqdiFOwZrXGVGQBw3SEbCbN3kP1+2LUGxyyJxmjOB3OuGLGdBoLGKYjZDMpZueUvYeqz4uhquzOF2tOMJVOxKzjnjFVWMAwHcGbeKlWsdSh4jRHJmdq5jDSSWslZUw1FjBMRwgyL0YuwHzrPVbpbUxVFjBMR9i8LuX2xaiewwgy33o6Yc1qjanGAobpCMl4F0evrd0Xw1tXa/pXK5IypjoLGKZjZDMphuoUSfV1x1mbqj6qfyoZtyIpY6qwgGE6RjaTZnedIqmtNfpggDNi7dTsXOCh0o1ZTeoGDBHZISJvEJFMKxJkzFINZlLsPVRkusrNfmikdpNamB+x1uoxjFksSA7jYmALcKeI3CAi50utRzRj2iSbSTOnsK9CXwxVJZcvVB1DypNK2pwYxlRTN2Co6i5VfQdwMnA9cB3wqIi8W0TaNyeoMWW81k+V6jFGC9OMT80GzmFYxbcxiwWqwxCR/wF8EPhH4GvAi4DDwA+jS5oxjfGCQW5kcT3GfB+MOjmMhBVJGVNN9eYiLhG5CxgFPg1coarePJh3iMjTokycMY04el0PXVXmxQjSBwPmi6Qsh2HMYnUDBvBiVX2o0gpV/eOQ02PMktXqixGklzdAOun8S1gdhjGLBSmSeo2I9HsvRCQjIu+NME3GLFl2IF0lYBRY21N9HgyPVyRlQ5wbs1iQgPFcVR31XqhqHrgwuiQZs3TVJlIaqjNKrSdlzWqNqSpIwIiJSLf3QkRSQHeN7Y1pm2wmzb7DE0zNLOyLUW8eDE/amtUaU1WQgPFF4FYRuUxEXg18H/hctMkyZmmymRRzCnsPzRdLOX0wguUwrFmtMdXVrfRW1Q+IyG+AZwECvEdVb4k8ZcYswfww50WOXd8LQL4wTWFqNlAOo8ea1RpTVZBWUqjqd4DvNLpzEbkA+AgQAz6lqu+rsM1LgHcBCvxaVV/mLp8FfuNu9qiqPr/R45vVxxuJ1l+PEbRJLUB3vIsusSIpYyoJ0g/jbOBfgMcBSZyb/7iqrq3zvhhwDfAcIIcztMhNqnqvb5uTgLcDT1PVvIhs8u2iqKpnNHpCZnXbvK6HWJcsaClVGtZ8oH6RlIiQTsatSMqYCoLUYfwrcAnwAJACXoMTQOp5CrBLVR9S1SngBuCism0uB65xW16hqvuDJtyYSuKxxX0xhkacHMbWADkMcFpKFaetWa0x5QINDaKqu4CYqs6q6meAZwR421ZgyPc65y7zOxk4WURuE5Hb3SIsT487Uu7tIvKCagcRkde62+0YHh4Ocjqmw2UzqVKQACeHsS6VYG1P7T4YnlTCpmk1ppIgdRgFEUkCO0XkA8BeoDfA+yqNaKsVjn8S8HQgC/yniDzB7fdxjKruEZHjgR+KyG9U9cFFO1T9BPAJgO3bt5fv36xC2Uya23YdKL0O2qTWY7PuGVNZkBzGy93t3giMA4PAnwR4X87d1pMF9lTY5puqOq2qDwP34wQQVHWP+/sh4MfAmQGOaQzZTIrHjkwwOePc9J0mtcEDhlMkZQHDmHI1A4Zbcf1/VHVCVQ+r6rtV9a/dIqp67gROEpHj3BzKxcBNZdt8A7d4S0Q24BRRPeQOP9LtW/404F6MCWBwII0q7B2daKgPhieVsByGMZXULJJS1VkR2SgiSbfiOjBVnRGRNwK34LSsuk5V7xGRq4AdqnqTu+48EbkXmAXeqqoHReSpwLUiMocT1N7nb11lTC3+vhh9PXGK07MMNlgkNVqYjip5xqxYQeowHgFuE5GbcIqkAFDVD9V7o6reDNxctuydvr8V+Gv3x7/Nz4EnBkibMYvMB4wCa3ri7rIG
chjJuBVJGVNBkICxx/3pAvqiTY4xzTt6rdMXY8gfMOpMzeqXTsRstFpjKggyNMi7W5EQY8ISj3WxeV2PWyTlNKXd2t9gpbfVYRizSJCe3j9icXNYVPWZkaTImBA4w5w7dRj96UQpcARhraSMqSxIkdRbfH/34DSptfy6WdYGM2l++sAwfT3x0vhSQaUTMaZnlenZORKxQH1bjVkVghRJ3VW26DYR+UlE6TEmFNlMmscOT9IdH+e0LTWHPVvEP6/3upQFDGM8df8bRGTA97NBRM4Hjm5B2oxZMq+l1KMjjfXyhvmAMWHFUsYsEKRI6i6cOgzBKYp6GLgsykQZ0yx/kGikSS3YJErGVBOkSOq4ViTEmDBlfUOZN5zDSDj/Fta01piFghRJvUFE+n2vMyLy+miTZUxzjl7bQ7zLGf+y0RxGyub1NqaiIDV6l7ujxwLgzl1xeXRJMqZ5sS5hi9v3otEchlckZU1rjVkoSB1Gl4iIO4yHNyBhMtpkGdO8bCbF2OQMvd2BZiIuSSWiq8PYM1rkC7f/nrecdwqxrkozAETnscMT/MPNv2VyZm7RukSsi7eef0qgWQlb5fO/eIRfPHiw3cloWjLexRXPPZXN6xp7cFmOgvwn3QJ8RUQ+jlP5/Trgu5GmypgQvPDMrfz+YKH+hmXSERZJffvuvXzsxw/ywjO3cvJRrR1p50f37ecbO/dw/MbeUnEdgCo8sH+M0wf7uezc5VNl+c+3PsDMnLKpr7vdSVmy2TnlweFxzjpuPS8765h2J6dpQQLG24DXAn+B01Lqe8CnokyUMWF48fbB+htVkIqwSCqXL5R+tzpg5PJFYl3C9/7yD4j7OiSqKk+48pZS2paD4tQsB8ameOv5p/CGZ5zY7uQs2eyccsrffWdZXdtmBAkYKeCTqvpxKBVJdQOdcQWMKZMutZKKImAUF/xupVy+wOZ1PQuCBYCIkM2k25KmanaPOreXRuuflhuvLm05XdtmBKn0vhUnaHhSwA+iSY4x7TffSir8ZrVD7pOmf87xVhmqMfNg+Tzo7TY04txgV3rAAPfadkgOI0jA6FHVMe+F+/fyqRkzJmSJmBDrktBzGN7sf9C+HEa1cbUGB9Lszhdx27a0nVeE0+g4YMvR4DKk9+HXAAAZZElEQVTLvTUjSMAYF5EneS9E5MlAZ5y9MRWICOlE+CPW5gvTpSDU6hvI5Mwsjx2erNonJZtJcWRyhsPF5dFZMZcvkox3sWHNyq3w9mQzKYaPTHbEUDNB6jD+EviqiOxxX28GXhpdkoxpvyjmxPCemjf1dbe8EnTP6ARQvYjHWz6UL7Auva5l6aomly+S7U/R1eKmx1HwJu/aPVrkhI1r2pya5tTNYajqncCpOK2kXg88rsIItsZ0lHQyFnqRlJerOOeE9eQL04xNtu5p3gtQ1QNG2t1ueRQe5PIFtnZA/QUsv2vbjKBjN58CPB44E7hERF4RXZKMab+eRBQBw7lpn3XcegB2t/AG4t2sslU65vnnQV8Ocvliw0O6LFfL7do2I8hYUlcC/+L+PAP4APD8iNNlTFulk7HQy5yHRoqs7YnzeHd+jla2ShoaKRDvEo5e21Nx/bpUgjXd8WXxFDw+OcPB8amOaCEFsKmvh0RMSi2/VrIgOYwXAc8C9qnqq4DTcfphGNOx0sl46KPV5vIFspl0W544c/kiW/pTVYcjcfpipJbFU/DuUefGupyGKWlGrEvY2r88rm2zggSMoqrOATMishbYDxwfbbKMaa9oiqScfhDre5P0JLpa+jTvBKvaT+zLpfNevfqWlWi5XNtmBQkYO9zhzT+JM5nSr4BfRpoqY9os7CIprw/G4EC6LT2rczU67XmcHEb7+2KU6ls6KmB0Rm/vIBMoeXNffFxEvgusVdW7vfUicpqq3hNVAo1ph7BbSY2MT1Gcni3dBAczKXKjrSmimJieZf+R6n0wPN7ovoeK0/Sn2zcgdS5fpDvexcYO6IPhyWZSHBhz+mL0
uKMhr0QNzXCvqo/4g4XrCyGmx5hlIex+GEOlp+Z06XerKkG9OoEgRVJA2ytnh0acJrUiK78Phme+ae3KrsdoKGBU0TmfqjGuVCJGYXo2tOKZ8nL5bCbFoeI0hyemQ9l/7WMHq0QeHFgezT9z+WJHDAni513boRVeLBVGwFgeg88YE6J0MsbsnDI9G1bAWPiU7z1xtqIvRtBK5OXSwSxIBf1Ks1yubbPCCBjGdJxU0qneC6tYKpcv0J9O0NeTAPyduVoRMIokYsKmvsp9MDzrUgn6euJtzWGMTc6QL0x3TKc9z8Y13SRjXW3PvTUrjIAxFcI+jFlWvFn3CtPh9MUob6XUyr4Y9fpg+LW7+efuDmwhBdDVJWztgJZSQXp631prmaqeHXaijGm3sOf1dgbTm39qHuhNkk7GWpTDCF7E0+7mn53YB8PT7msbhqoBQ0R6RGQA2CAiGREZcH+2AVtalUBj2iEV4rzeTh+MhTdtr2d1K4YHGRpZGKxq8Sb7aVdfDO96dFqRFLgBYxlNUrUUtfph/DnO0OZbcDrsefnZw8A1EafLmLZKhziv94GxKSam5xY9Nbei+GdiepYDY5OlVjr1DGbSFKZmyRemGehtfV+MXL5IT6KLDWva1w8kKtlMmoPjUxSmZkgng8wssfxUzWGo6kdU9TjgLap6vKoe5/6crqr/2sI0GtNyYRZJzRezLHxqbsXYTbmy/h/1tHtkVW+U2k7qg+Hxrm0rRykOW5BK730i0gcgIn8nIl/3z8BnTCcKs0iqWj+IbCbF4QmnZ3VUGq0TaHfzz9xo5zWp9bT72oYhSMD4e1U9IiLnAucDnwM+Fm2yjGkvr8igGEIrKe8GUT4hUCv6YjSaw9i6LHIYnRkwBjtgXowgAcN7xPoj4GOq+k2g8woYjfEpNasNqUgqk3bmm/AbbMFwEfN9MIKNy7QulWBtT3vmxTgyMc1oB/bB8GxY000y3tpRisMWJGDsFpFrgZcAN4tId8D3GbNieQPEhVEkNVRl9rj5ebSju4EM5QtsbXBubGecq9Y/BXfiKLV+XV1Ctt9phbZSBbnxvwS4BbhAVUeBAeCtQXYuIheIyP0isktErqiyzUtE5F4RuUdErvctf6WIPOD+vDLI8YwJSzrUOozK5fL96QS9yVjkOYxGJyIaHGhPf4FSXU+H5jDAmSK3o3MYqlrAmTTpXHfRDPBAvfeJSAyn+e1zceYDv0REHl+2zUnA24GnqeppOM14cft/XAmcBTwFuFJEMgHPyZimJWJdJGJCoclmtarK7io37VbMi7F7CeMyeWlqdV+MTu6051npnfeCzun9NpwbO0AC+GKAfT8F2KWqD6nqFHADcFHZNpcD16hqHkBV97vLzwe+r6oj7rrvAxcEOKYxoelJND/E+fDYJJMzi/tgeKK8gRSnZjkwNtVwnUA2k6I4PcvIeGtH/cnli6QSsbb0/2iVbCbFyPgU45PhTv/bKkGKpF4IPB8YB1DVPUBfgPdtBYZ8r3PuMr+TgZNF5DYRuV1ELmjgvQCIyGtFZIeI7BgeHg6QLGOCcSZRau4fu165fJR9MXaPLu2JvV3NP72iu07sg+EptYwbXZm5jCABY0qdvKkCiEhvwH1X+tTL87hx4CTg6cAlwKfc6WCDvNdZqPoJVd2uqts3btwYMGnG1JdOxilOzzW1j3pDXQwOpDkyMcOhQvh9MbyJkBoPGF5lfGsrZ4dGOrdJrad0bVfoECFBAsZX3FZS/SJyOfADnPm968kBg77XWWBPhW2+qarTqvowcD9OAAnyXmMilUrEKIaUw9jaXz2HAdHcnL2cS6OVyK0cet0vly80XEG/0gyu8M57QQLGRuBG4GvAKcA7cW7g9dwJnCQix4lIErgYuKlsm28AzwAQkQ04RVQP4bTKOs8d9DADnOcuM6ZlUiHM653LF1nfm6S3u/LYQVEW/+TyRZLxLjY0ODd2X0+C/nSipR3MnNkHZzo+h7FhTZLu+MqdFyPICFjP
UdW34VQ8AyAiH8SpCK9KVWdE5I04N/oYcJ2q3iMiVwE7VPUm5gPDvTgdBN+qqgfdY7wHJ+gAXKWqIw2emzFNSSdjjDVZOVlvaPEox25yhlRvrA+Gp9WteXY32CN9pfJGKV6pOYyqAUNE/gJ4PXC8iNztW9UH3BZk56p6M3Bz2bJ3+v5W4K/dn/L3XgdcF+Q4xkQhlYgxfGSyqX3szhd53Oa1VdevSzk9wKPJYRQWDUcSVLY/za7hsZBTVN1qaFLrafckVc2oVSR1PfA8nGKk5/l+nqyqf9aCtBnTVukmi6Tm5pTcaO2K3CifOHNVepgH4bXealVfjEbHvFrJWjFKcVSq5jBU9RBwCKf1kjGrTrN1GMNjk0zV6IPhcZ44w72BjE/OcHB8aslP7NlMionpOQ6MTbEx4DhUzRjKF0gnY2TSiciP1W7ZTJp8YZqxyZlF44stdzYmlDFVpBJxJpro6V1tHoxyXg4jzKd5r53/Ulsdee9r1ZNwLl9ksEPnwSjnTWa1EnMZFjCMqcLruLfUG/n8PBj1chgpxibDnRej2TqBVnfe6+RhzcuVru3IyqvHsIBhTBWpZIw5hcmZpXXem++DUS+HEf7NudmRX7e2uC9GvdZknaTdsxo2wwKGMVV407QutVgqly+wYU2yNHtfNVHcQHL5It3xLjY22AfDs6Y7TqZFfTEOFac5MjGzKiq8Adb3JulJrMx5MSxgGFNFs5Mo5fJFtga4CUbR+9drUttMnUCrmn+upia10JpRiqNiAcOYKlJNBoyhkWDFLOvSCfp64qGOL+SMy9TcE3s205rJfubHvFodOQxo3bUNmwUMY6pINTHr3tycsnu0GHgcp7CfOHP5QmkO6aUaHEizuwXzYpTGvKrTOKCTDFoOw5jOkk46beSLS6jD2H9kkulZDVzMEmbnvbHJGfIhzI2dzaSYnJljeKy53u715PJF1nTHWZfq/D4Ynmwm5Y6fFf4oxVGygGFMFfNFUo2PJ9VouXyYPat3N9lCyp8miL6llNekdjX0wfCU5sVYYbkMCxjGVNFMkVSjQ11kM2nGp2YZDWFejLAqkVvVF2M1Nan1tGsI+WatrH7pxrRQkFZS+w9P8ODw+KLldzx8EGgshwFwyz37OHZ90DnKKvv5g96xmyuS8ubwuP2hg0tunhtELl/k7OPXR7b/5cj7vH/+4IGKw4OcdNSahoelbwULGMZU4QWMWnUYl3/hLn49NFpx3db+FD2J2n0wPCduWgPAFV//TYOprGxdKsGGNc3Njd3bHWfLuh6uv+NRrr/j0VDSVc0J7vmvFgO9STLpBJ+57RE+c9sji9Y/9YT1XH/52a1PWB0WMIypwqvDqFYkpao8uH+MC594NC8/e9ui9cesD/6Ef8LGNXz7TedyuNjc/Buerf3h1Al85XXnlJq9RiUeE84Y7I/0GMuNiHDTG8+tWCT1yf98iHv2HGpDquqzgGFMFV4dRrUiqUNFZ8TRJx2T4ZwTmi9SOW3Luqb3EbZsJr2q+ke00uBAuuLgkL98eIQf3refyZlZuuPBcqitYpXexlQRj3WRjHVVLZJaTXM4mNbx6jf2jE60OSWLWcAwpoZUMkaxSrPa1TakhWmN5Tw4oQUMY2qoNeueV7YftDe3MUFkB1o7tHwjLGAYU0MqEaNQtUiqQF93nLUpqwo04Tl6bQ/xLgl1bLGwWMAwpoZUMsZElRxGLl8kO7A6ZokzrRPrErb0RzPPe7MsYBhTQ60iqdU0S5xpLW+omOXGAoYxNfRUKZJS1VU5pIVpjTAHowyTBQxjakhXaSU1WphmfGrWmtSaSGQzafYfmVzybI9RsYBhTA3pZLxiP4wha1JrIuR9r3aPLq9chgUMY2pw+mEsDhi5kIYQN6aSVo0U3CgLGMbUkEpUrvSe77RnRVImfN7sg8ut4tsChjE1pJMxitOziyY2yuWLrO1ZXbPEmdbZ1NdDIiaWwzBmJUklY6jC5MzcguVOk1rLXZhoLNe+GBYw
jKmh2oi1QyPWpNZEazn2xbCAYUwN6Qrzejt9MCyHYaKV7U9HPhdJoyxgGFNDKumME+VvDz8yPkVxetZyGCZS2UyKA2PLqy+GBQxjakhXKJLyypUrTX5jTFgGl+GotRYwjKkhlaweMCyHYaK0HOfFsIBhTA2V5vX2/oG3WsAwEVqOnfcsYBhTg1fp7R8eZChfYF0qwdoe64NhorOpr3vZ9cWwgGFMDemEU+ldXiRlxVEmal1dwtb+VGncsuXAAoYxNfQknX8R/4i1uXzRpmU1LTE4kF49OQwRuUBE7heRXSJyRYX1l4rIsIjsdH9e41s361t+U5TpNKaatNus1iuSsnkwTCtlMyl2L6McRmSTEYtIDLgGeA6QA+4UkZtU9d6yTb+sqm+ssIuiqp4RVfqMCaK8p/fB8SkmpucsYJiWyGbSHBibojg1W2qA0U5R5jCeAuxS1YdUdQq4AbgowuMZE7pYl5CMd5VaSQ2N2Ci1pnXm58VYHrmMKAPGVmDI9zrnLiv3JyJyt4jcKCKDvuU9IrJDRG4XkRdUO4iIvNbdbsfw8HBISTdmnn9e71IfjAHLYZjoeQFjuQwREmXAkArLtOz1fwDbVPV/AD8APudbd4yqbgdeBlwtIidUOoiqfkJVt6vq9o0bN4aRbmMWSCdipTqM+U57lsMw0Zvvi9H5OYwc4M8xZIE9/g1U9aCqTrovPwk82bduj/v7IeDHwJkRptWYqvyz7uXyBTLpBGu6I6v+M6Zk45pukvGuZdNSKsqAcSdwkogcJyJJ4GJgQWsnEdnse/l84Lfu8oyIdLt/bwCeBpRXlhvTEqlkrDRarY1Sa1qpq0vILqN5MSJ7TFLVGRF5I3ALEAOuU9V7ROQqYIeq3gS8SUSeD8wAI8Cl7tsfB1wrInM4Qe19FVpXGdMS6UTcVyRV4OSj+tqcIrOabF1G82JEmq9W1ZuBm8uWvdP399uBt1d438+BJ0aZNmOCSiVjjBamSvNgPPPUTe1OkllFspk039uzr93JAKyntzF1pRJOK6nhsUkmZ+asSMq0VDaT4uD4FOOTM/U3jpgFDGPq8JrV2rDmph3m+2K0vx7DAoYxdaSSMSamZ23iJNMW8xMptb8ewwKGMXXM5zDceTD6LYdhWmd+IiXLYRiz7KXcjntDIwUGepP0Wh8M00Ib13TTvUz6YljAMKaOlDti7QOPjVn9hWk5EWFrJlUax6ydLGAYU4c3694D+y1gmPbIZpbHvBgWMIypwxtW+lBx2iZOMm0xuEw671nAMKYOb04MsCa1pj2ymTT5wjRjbe6LYQHDmDrSSX/AsByGab1SX4w2F0tZwDCmjlTSchimveab1ra3WMoChjF1+IuktlrAMG3g5Wzb3VLKAoYxdaTdZrXre5Olv41ppQ1rkvQk2t8XwwKGMXV4dRhZGxLEtImILIumtRYwjKnDq8Ow+gvTTtlMitxoe4ukLH9tTB1eHYYFDNNO2UyKnz1wgOd86CcV13/sz57MiZvWRJoGCxjG1NHbHeet55/C+acd1e6kmFXsxU8eJF+YRlUrru+OR19gJNUOvhJt375dd+zY0e5kGGPMiiIid6nq9nrbWR2GMcaYQCxgGGOMCcQChjHGmEAsYBhjjAnEAoYxxphALGAYY4wJxAKGMcaYQCxgGGOMCaSjOu6JyDDw+yW+fQNwIMTkrBR23qvLaj1vWL3nHuS8j1XVjfV21FEBoxkisiNIT8dOY+e9uqzW84bVe+5hnrcVSRljjAnEAoYxxphALGDM+0S7E9Amdt6ry2o9b1i95x7aeVsdhjHGmEAsh2GMMSYQCxjGGGMCsYABiMgFInK/iOwSkSvanZ4wicigiPxIRH4rIveIyJvd5QMi8n0RecD9nXGXi4j8s3st7haRJ7X3DJZORGIi8l8i8i339XEicod7zl8WkaS7vNt9vctdv62d6W6WiPSLyI0icp/7uZ+zSj7vv3K/4/8tIl8SkZ5O/MxF5DoR2S8i/+1b1vDn
KyKvdLd/QEReGeTYqz5giEgMuAZ4LvB44BIReXx7UxWqGeBvVPVxwNnAG9zzuwK4VVVPAm51X4NzHU5yf14LfKz1SQ7Nm4Hf+l6/H/iwe8554DJ3+WVAXlVPBD7sbreSfQT4rqqeCpyOcw06+vMWka3Am4DtqvoEIAZcTGd+5p8FLihb1tDnKyIDwJXAWcBTgCu9IFOTqq7qH+Ac4Bbf67cDb293uiI8328CzwHuBza7yzYD97t/Xwtc4tu+tN1K+gGy7j/OM4FvAYLT2zVe/rkDtwDnuH/H3e2k3eewxPNeCzxcnv5V8HlvBYaAAfcz/BZwfqd+5sA24L+X+vkClwDX+pYv2K7az6rPYTD/RfPk3GUdx812nwncARylqnsB3N+b3M065XpcDfxvYM59vR4YVdUZ97X/vErn7K4/5G6/Eh0PDAOfcYvjPiUivXT4562qu4F/Ah4F9uJ8hnexOj5zaPzzXdLnbgHDefIs13FtjUVkDfA14C9V9XCtTSssW1HXQ0T+F7BfVe/yL66wqQZYt9LEgScBH1PVM4Fx5osnKumIc3eLUy4CjgO2AL04xTHlOvEzr6XaeS7p/C1gOJF10Pc6C+xpU1oiISIJnGDxb6r6dXfxYyKy2V2/GdjvLu+E6/E04Pki8ghwA06x1NVAv4jE3W3851U6Z3f9OmCklQkOUQ7Iqeod7usbcQJIJ3/eAM8GHlbVYVWdBr4OPJXV8ZlD45/vkj53CxhwJ3CS25oiiVNRdlOb0xQaERHg08BvVfVDvlU3AV7LiFfi1G14y1/htq44GzjkZXVXClV9u6pmVXUbzuf5Q1X9U+BHwIvczcrP2bsWL3K3X5FPm6q6DxgSkVPcRc8C7qWDP2/Xo8DZIpJ2v/PeeXf8Z+5q9PO9BThPRDJu7uw8d1lt7a68WQ4/wIXA74AHgXe0Oz0hn9u5OFnNu4Gd7s+FOOW1twIPuL8H3O0Fp9XYg8BvcFqdtP08mjj/pwPfcv8+HvglsAv4KtDtLu9xX+9y1x/f7nQ3ec5nADvcz/wbQGY1fN7Au4H7gP8GvgB0d+JnDnwJp55mGiencNlSPl/g1e757wJeFeTYNjSIMcaYQKxIyhhjTCAWMIwxxgRiAcMYY0wgFjCMMcYEYgHDGGNMIBYwjKlDRN4lIm9Z6np3mxcEGdTS3VdBRDb5lo01lmJjomEBw5jWeAHOaMhBHAD+JsK0GLMkFjCMqUBE3iHOHCk/AE5xl50gIt8VkbtE5D9F5NQK71u0jYg8FXg+8I8istPdpta+rgNe6g5B7d93r4h8W0R+7c758NIIL4Exi8Trb2LM6iIiT8YZUuRMnP+RX+GMfPoJ4HWq+oCInAV8FGecKr9F26jqM0XkJpwe5ze6x7i1xr7GcILGm3HmLPBcAOxR1T9y97Eu7HM3phYLGMYs9j+Bf1fVAoB7s+/BGczuq85QRYAz9ESJOyJwzW0a2O6fgZ0i8kHfst8A/yQi78cJPv+5pLMzZoksYBhTWfmYOV04cyucUeM9QbYJtJ2qjorI9cDrfct+5+Z+LgT+QUS+p6pX1TmWMaGxOgxjFvsp8EIRSYlIH/A8oAA8LCIvhtJcyaf736TOPCPVtjkC9AXYzu9DwJ/jPtiJyBagoKpfxJksaMXOv21WJgsYxpRR1V8BX8YZ2fdrgFf086fAZSLya+AenAl7ylXb5gbgre4seCcE2ZeqHgD+nfniqicCvxSRncA7gPc2e67GNMJGqzXGGBOI5TCMMcYEYgHDGGNMIBYwjDHGBGIBwxhjTCAWMIwxxgRiAcMYY0wgFjCMMcYE8v8AT6nGSPrX3Q0AAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finished\n"
     ]
    }
   ],
   "source": [
    "print(\"start\")\n",
    "\n",
    "# Text preprocessing\n",
    "folder_path = './Database/SogouC/Sample'\n",
    "\n",
    "all_words_list, train_data_list, train_class_list, test_data_list, test_class_list = text_processing(folder_path, test_rate=0.2)\n",
    "\n",
    "# Build the stopword set\n",
    "stopwords_file = './stopwords_cn.txt'\n",
    "stopwords_set = make_word_set(stopwords_file)\n",
    "\n",
    "# Feature extraction and classification\n",
    "flag = 'sklearn'\n",
    "deleteNs = range(0, 1000, 20)\n",
    "test_accuracy_list = []\n",
    "for deleteN in deleteNs:\n",
    "    feature_words = vocab_select(all_words_list, deleteN, stopwords_set)\n",
    "    train_feature_list, test_feature_list = text_features(train_data_list, test_data_list, feature_words, flag)\n",
    "    test_accuracy = text_classifier(train_feature_list, test_feature_list, train_class_list, test_class_list, flag)\n",
    "    test_accuracy_list.append(test_accuracy)\n",
    "print(test_accuracy_list)\n",
    "\n",
    "# Evaluate the results\n",
    "plt.figure()\n",
    "plt.plot(deleteNs, test_accuracy_list)\n",
    "plt.title('Relationship of deleteNs and test_accuracy')\n",
    "plt.xlabel('deleteNs')\n",
    "plt.ylabel('test_accuracy')\n",
    "plt.show()\n",
    "# plt.savefig('result.png')\n",
    "\n",
    "print(\"finished\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.naive_bayes import ComplementNB, GaussianNB, BernoulliNB"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "start\n",
      "总样本数： 90\n",
      "[0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7368421052631579, 0.7894736842105263, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.7894736842105263, 0.7894736842105263, 0.8947368421052632, 0.8947368421052632, 0.8947368421052632, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8947368421052632, 0.8947368421052632, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.7894736842105263, 0.7894736842105263, 0.7894736842105263, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8421052631578947, 0.8947368421052632, 0.8947368421052632, 0.8947368421052632]\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEWCAYAAAB1xKBvAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3XmcZHV97//Xu7fqmZ6etXsQGJYBRwRXdC6KYjQqMpIb0EQjE41BCWgUYnygEa5GEfVGjYomwQU3XCKIJJpRETRI9MoPdQZZdEB0RIRh6xpmq+6eruqq/vz+OOf01NRUd52qrlPr5/l49KOrzlbfU8v5nO8uM8M555yrpKfZCXDOOdcePGA455yLxQOGc865WDxgOOeci8UDhnPOuVg8YDjnnIvFA0aLk/QCSdsXsP+nJf1jPdNU5jVM0uPnWPdqSd9P6HX/VtKjksYlrYqx/X2SXhxju6PDc+qrT0rbg6T/kfQ3zU6Ha10eMBogvFDtCy9sj0i6UtKSBF7nbEk/KV5mZm80s/fV+7XiMrN/N7OX1Pu4kvqBjwEvMbMlZvZYvV8jZjoOes8rbP+CMBhdXrL8J5LOrnsC60TSJZK+WqdjzXmD4VqbB4zG+VMzWwI8HTgRuLjJ6Wl3hwCDwNZmJ6QGE8BrJR3d5HS4eXRbDjMODxgNZmaPADcQBA4AJKUkfUTS/WERy6clLSq3v6SLJP1OUkbSXZJeHi4/Hvg0cHKYk9kdLr9S0vuL9j9X0jZJOyVtknRY0TqT9EZJv5W0S9LlkhSue7ykH0naI2mHpK+XJO3Fc+x3wB14+Bp/J+ne8Dj/LKns9zB8Xz4u6aHw7+PhsicA94Sb7Zb0wzn2/ytJf5D0mKR3lqzrKXovH5N0jaSVcxxnmaTPS3pY0oOS3i+pd573vNLnuRu4EnjPHK9X6b0u3vYbYa51j6QfS3pS0borw8/iu+H35WeSji1af6qkX4f7/hugOV5jA/B/gFeF53nHfO/LfOcg6cfhYe8Ij/Wqec5thaTvSEqH36vvSFpTtH6lpC+G341dkr5VtO5MSbdL2ht+xhvC5QcUS6oo56T9RZHnSLof+GGM93iRpI+G37M9CnKKi8L3/IKS87lT0svmOt+2YGb+l/AfcB/w4vDxGuCXwCeK1n8c2ASsBIaBbwP/FK57AbC9aNtXAocRBPtXEdytHhquOxv4SclrXwm8P3z8QmAH8AwgBfwr8OOibQ34DrAcOBJIAxvCdVcB7wxfdxA4JeZ+B6Qp3Pam8FyPBH4D/M0c79ulwE+B1cAo8P8B7wvXHR0eq2+OfU8AxoE/Cs/1Y0C+6HP4+/DYa8L1nwGuKnds4Fvh+qEwLT8H3jDPe17x8wQeB+wFjguX/wQ4u9J7XeY8Xx++Rip83dtLPvudwElAH/DvwNXhupHw9V8B9ANvDd+fuT6LS4Cvliyb732p9H15fIzfzSrgz4HF4Tl+A/hW0frvAl8HVoTn8Pxw+UnAHuDU8PUPB55Y+lssPa+iz/3L4TktivEeXw78T/gavcBzwu3+AvhZ0XZPAx4DBpp9PVrQtazZCeiGv/BLOg5kwi/kjcDycJ0ILvrHFm1/MvD78PELKAoYZY59O3Bm+Phs5g8Ynwc+XLRuCTANHB0+t5If9jXAReHjLwNXAGvKpGG+/Q5IU7jthqLnbwJunOPcfgecXvT8NOC+8HH0454rYLyb8OIYPh8CcuwPGHcDLypaf2j4XvQVH5ug6CsbXTzCbTcCN81xfrE/T+DDwNfDx8UBY873usL3bHmY7mVFn/3nitafDvw6fPxa4Kcl6d5OzIAR432p9H2pGDDK7Pd0YFfR5zUDrCiz3WeAy+b5LVYKGMfEeY8JgtE+4GlltksRBOt14fOPAJ+s9pxb7c+LpBrnZWY2THDBeCLBHR4Ed86LgVsl7Q6LNa4Plx9E0mvDrHa07ZOLjlXJYcAfoidm
Nk5w13N40TaPFD2eJAgqAP9AcFH5uaStkl5fcuy59ivngaLHfwjTVTG9FbYtt+/s65jZBMG5Ro4Cvln0Pt4NFAguhJRs1w88XLTtZwjuqMup5vP8EHCapKeVLK/0XgMQFot9MCxy2UtwMYQDvw9zfS6l749x4OdSSaX3JdY5zEfSYkmfCYt79gI/BpaHxV5HADvNbFeZXY8guNmo1ez7UOE9HiHIPR30WmaWJbhxeo2CIteNwFcWkKaW4JU6DWZmP5J0JcEdx8sIioj2AU8yswfn21fSUcBngRcBt5hZQdLt7C97rjT08EMEP/ToeEME2f55XzdM9yPAueF+pwD/LenHZrat0r5lHMH+yuojw3TNl94425Z6GDg+eiJpMcG5Rh4AXm9mN5fuqAMrox8guJMeMbN8mdcpfc9jf55m9pikjwPvK1ke973+S+BM4MUEF7JlwC7mqIso8TDB50D4Oip+Xi65Jc/nfV/q9H25EDgOeJaZPSLp6cBtBOf3ALBS0nIz210mbcdS3gRBQI88rsw2xec633u8A5gKX+uOMsf5EkGQ+AkwaWa3zJGmtuE5jOb4OHCqpKeb2QxBELhM0moASYdLOq3MfkMEX+Z0uN3rCHIYkUeBNZIG5njdrwGvk/R0SSng/xKUs95XKcGSXllU4bgrTEeh0n5zeHtYoXkE8BaCcuhyrgLeJWlU0ghBMVPcpp3XAv9b0inh+3EpB37fPw18IAzChK9xZulBzOxh4PvARyUtVVBZfqyk54ebHPCeV/l5QlC38hwODG5x3+thgov2YwQXwf87/1tygO8CT5L0ZwpaA/0d5S+ekUeBo8O75YrvS4VzeBQ4JkYahwmC724FDRLeE60IX/97wCfD71K/pD8KV3+e4Hv+ojBdh0t6YrjuduCscPv1BHU4ldJQ9j0OP+svAB+TdFiYGzk5/G0RBogZ4KN0QO4CPGA0hZmlCcp4ow517wC2AT8Ns73/TXBnVbrfXQRfvlsIfnRPAYrvkH9IcDf+iKQdZfa/MXzN/yC4wzwWOCtmsv8X8DNJ4wQVum8xs9/H3LfUfwG3Evx4v0vwAy/n/cAW4E6ChgK/CJdVZGZbgTcTBMmHCS5axR0gP0FwHt+XlCGoAH/WHId7LTAA3BUe51qCMnQo/57H+jzDdO4lqMsobqEV973+MkEx3YNh2n46R/rLve4OggYUHyS4GK7jwO9SqW+E/x+T9Ivw8Xzvy3zncAnwpbAo6y/mec2PA4sI7uR/SlC0V+yvCOqdfg2METRkwMx+DrwOuIyg8vtH7M9Z/yPB934X8F6C78d8Kr3HbyP4bm4mqLP4EAdeV79M8DutSx+WZlNYIeNcQ0gygorAWoqynGsrkl4LnGdmpzQ7LfXgOQznnEtAWG/2JoLWYh3BA4Zzrmkk/R8FHfhK/77X7LQtRFhnlSYoOq5U7NU2vEjKOedcLJ7DcM45F0ui/TAUjN/yCYIu858zsw+WrD+KoFnaKEELg9eY2fZw3V8D7wo3fb+ZfanS642MjNjRRx9dvxNwzrkucOutt+4ws7KdhYslViQV9sb8DcF4LtsJmp1tDJuGRtt8A/iOmX1J0guB15nZX4VtrrcA6wnab98KPHOOXp2z1q9fb1u2bEnkfJxzrlNJutXM1lfaLskiqZOAbWZ2r5nlgKsJekwWO4FgXCUIBqSL1p8G/MDMoq7/PwA2JJhW55xzFSQZMA7nwLFptnPgmEUQdKf/8/Dxy4FhBTOnxdnXOedcAyUZMMqNZ1Na/vU24PmSbgOeT9CbMh9z3+BFpPMkbZG0JZ1OLyS9zjnn5pFkwNjOgYOZraFk4Dgze8jM/szMTiQYOx8z2xNn36JjXGFm681s/ehoxTob55xzNUoyYGwG1klaGw7MdhbBmDKzJI1o/2xrFxO0mIJgRrqXhIOKrQBeEi5zzjnXJIkFjHDI4/MJLvR3A9eY2VZJl0o6I9zsBcA9kn5DMA/BB8J9dxIM+bw5/Ls0XOac
c65JOqqntzerdc656sVtVusTKDnX4r59x0P89tFMVfusO2SYP31a3MkJkzc1XeCLN9/HvtzBc1D19vSw8VlHsHp4sAkpaz23/mEnP7qn+gY8r3n2Uaxemux76AHDuRZmZlx4zR3kCjMozjx6gBkM9PbwJ085lJ6emDsl7OZtO/jQ9b8GOOg8zGCwv4c3PH+uSfK6y4e+dw8/v29n7M87ctqTH+cBw7lulivMkCvM8PbTjuPNf/z4WPt88ebf895v38XufdOsHJpr8sXGenRvFoBbLn4hhy5bdMC6J7/nhtn1Dh7NTHHG0w7jXzae2OykHMQHH3SuhU1mg1lNhwZ6Y+8zOpwCIJ1pnYtwlJZVQ6mD1o0Op0iPt05amy2dyc5+hq3GA4ZzLWwiLPNfnIpfGDC6pAUDxvgUKxb3M9B38CVndEmKdGaqCalqPRPZPJO5ggcM51z1JnNRDqOKgBHlMMZb5yI8313z6HCqpYJbM0XvQxT0W40HDOda2EQ2ymG0f5GUB4zKoqI5z2E456pWSw5jSaqPwf6elroIp8ezc941jw6n2DuVZ2q60OBUtZ7ZHIYHDOdctcajHEYVld6SWD082DIBw8xIZ7JzNvmMAskOr/ie/cxWe8BwzlVrMqz0Hqqi0huCO9SxFgkYmWyeqemZuXMYS4PlrZLeZhrLTNHbI1Ysbo3m0KU8YDjXwiZqaFYLUcuj1rgAVypmacVWXc2SzmQZWTLQMh0uS3nAcK6FTdbQrBZaq29DpYCxugUr6ZullftggAcM51palMNY1F9lDmM4xe7JabL55lckVwoYK4cGkDxgwPyNA1qBBwznWthkLs+i/l56qyyiiC7Oj43nkkhWVSr1Lejr7WHV0EDL5IiayXMYzrmaTeQKDFXRByPSSvUC6fEs/b1i2aL+ObcZaaE6l2aZmTF2jOc8YDjnajOZzbO4ij4YkVbqvBdU5Kbmrcj1znuwazJHYca8SMo5V5uJXKGqPhiR1S3UVHUsk63Yr8ADxv7PKukhyhci0YAhaYOkeyRtk3RRmfVHSrpJ0m2S7pR0eri8X9KXJP1S0t2SLk4ync61qslcniVVtpCC/aPCtsJFOE65fNTRsJNmAK1Wq/fyhgQDhqRe4HLgpcAJwEZJJ5Rs9i6Cub5PBM4CPhkufyWQMrOnAM8E3iDp6KTS6lyrmsgWqm5SCzDQ18OKxf0tMQBhnIAxOpwiV5hh776DZ+TrFq0+8CAkm8M4CdhmZveaWQ64GjizZBsDloaPlwEPFS0fktQHLAJywN4E0+pcS5rM5avutBdphWKewoyxc6JyU9FWHGG30Vp94EFINmAcDjxQ9Hx7uKzYJcBrJG0HrgMuCJdfC0wADwP3Ax8xs53lXkTSeZK2SNqSTlc/D65zrWwiW6ip0htaI2A8NpFlxipfBKOA0gp1Ls2SzmRZPNBb9TAwjZRkwCjXJKK0gHIjcKWZrQFOB74iqYcgd1IADgPWAhdKOqbci5jZFWa23szWj46O1i/1zrWAyVy+pma1EA4P0uS+DXHL5VupVVeztHofDEg2YGwHjih6vob9RU6Rc4BrAMzsFmAQGAH+ErjezKbNbAy4GVifYFqda0lBK6mF5TCaWZHsASO+dKa1e3lDsgFjM7BO0lpJAwSV2ptKtrkfeBGApOMJAkY6XP5CBYaAZwO/TjCtzrWc6cIMufxMzXUYq4cHmZqeIZNtXkXybFPR4fmbii4d7GOgr7Xm8Gi0sczUbHPoVpVYwDCzPHA+cANwN0FrqK2SLpV0RrjZhcC5ku4ArgLOtuB26HJgCfArgsDzRTO7M6m0OteKosmTamklBa1x1x699kiFO2dJLTXCbjO0Qw4j0doVM7uOoDK7eNm7ix7fBTy3zH7jBE1rneta0fSsC2klBcGF6NjRJXVLVzXSmSzDqT4WxTiHVhpht9Gmpgvsncp3dR2Gc24Bah3aPNISOYzx+BW5q1ugVVez7GiDJrXgAcO5llXr5EmRVhiA
MJ3JMhLzItgKzYCbpR16eYMHDOda1kSUw6ixldSyRf3096qpxTw7qmgqOjqcYudkjunCTMKpaj37e3m37jhS4AHDuZY1GeUwauyH0dOjpg8bXk1F7uhwCjPYOdH8OTwarR16eYMHDOda1kJzGNDceoF9uQKZbD52U9FWKEJrlnQmiwSrlgw0Oynz8oDhXIuKmtXWmsOA4I61WcNtVDuYXnR3PZbpvvGkxjJZVi4eoL+3tS/JrZ0657pY1Kx2ITmMZlYkRwMJVlOHAd2bw2j14ijwgOFcy5rNYdTYSgqCu/udE1kKM40fHqTalj8jXV4k5QHDOVeziVyeVF8PfQsophgdTjFjwaixjVZtwBjs72XZov7uDRgt3ssbPGA417Ims4UFD3XdzGKedCZLj/bP/hdHN/b2NrOqOjg2kwcM51rURC5f03zexZoaMMazrBxK0dtTbqaD8rpxPKm9U3ly+RkPGM652k1mCwwtoMIb9ncEa1YOo9qLYDf29m6XXt7gAcO5ljWRy7N4AU1qobipauMvwmOZLKtrCBjdNute1IzYA4ZzrmaTuYXnMBYN9DKc6murHMZkrjDbpLgbRJ9NtcG1GTxgONeiJrILr8OA5lQkz8wYO2qoyO3G3t7tMo4UeMBwrmVN5PILbiUFMNKEeoE9+6aZLljVTUVnK+m7qKVUejzLQG8PSxclOj1RXSQaMCRtkHSPpG2SLiqz/khJN0m6TdKdkk4vWvdUSbdI2irpl5JaP/w6V0eT2ULdchg7Ghwwah1MLxp3qttyGKPDKaT4rcmaJbGAIamXYKrVlwInABslnVCy2bsIpm49kWDO70+G+/YBXwXeaGZPAl4ATCeVVudaUb1yGM1oqlpry59uLZKKO2dIsyWZwzgJ2GZm95pZDrgaOLNkGwOWho+XAQ+Fj18C3GlmdwCY2WNmVkgwrc61lMKMMTU9U7ccRiabZ1+ucT+hWgPGisUD9Pao6wJGO/TyhmQDxuHAA0XPt4fLil0CvEbSdoK5vy8Ilz8BMEk3SPqFpH+Y60UknSdpi6Qt6XS6fql3romi6VkX2koK9re+aeRFOGoqWm3Ln2AOj4GuGrE2ncnGHgK+2ZIMGOUK5EpHQNsIXGlma4DTga9I6gH6gFOAV4f/Xy7pReVexMyuMLP1ZrZ+dHS0fql3romigQcX2g8DiiuSG3cRTmeyDPb3sKSGIrVu6rw3XZhh52TOcxgEOYojip6vYX+RU+Qc4BoAM7sFGARGwn1/ZGY7zGySIPfxjATT6lxLifoh1COH0YzhQRZSkTu6pHvGk9o5kcOsPTrtQbIBYzOwTtJaSQMEldqbSra5H3gRgKTjCQJGGrgBeKqkxWEF+POBuxJMq3MtZTaHUac6DGhwwBivvVy+m3IY7TQsCCQYMMwsD5xPcPG/m6A11FZJl0o6I9zsQuBcSXcAVwFnW2AX8DGCoHM78Asz+25SaXWu1UQ5jFqKdEqtGkrRo+bkMGoxOpxix3iOmSbM4dFo7RYwEu0pYmbXERQnFS97d9Hju4DnzrHvVwma1jrXdfbXYSz8J9rbI1YONbaYJ53JctLalTXtu3p4kMKMsWsyx6o2KduvVbXT2Dab9/R2rgVNzLaSWniRFDS2mCeXn2HX5HTNQ110U2/vWjs4NosHDOda0GS2fjkMCJq3NipgRLP71dpUtJvm9k5nsiwd7GOwvz43BknzgOFcC0oih9GoYcPH9i6smCXaLzpOJxvLTLVN7gI8YDjXkva3kqpPDiOoSM42pCJ5oRW5XVUktYDGAc3gAcO5FjSRzdPfKwb66vMTHV2SYrpg7NmX/JBsCy2XH0r1sXigt2uKpEaH22dcVQ8YzrWgyVyhbrkLaOxde3ShX7VkoOZjdEtfjHYaRwo8YDjXksaz+brVX0BjK5LTmSzLF/eT6qs9/c0YYbfRJrJ5JnIFL5Jyzi3MZC5ftxZS0PiAsdC75tVLO394kB1t1qQWPGA415ImsoW65jAaOWJt
enzho692Qw6jnebyjnjAcK4FTebyda3DWJLqY7C/pyHDho9lphacwxgdTrFn3zRT0507Dc5Ymw0LAh4wnGtJE9kCQ3UY2jwiqSEVyWZWl6ai0f47OrhYqt3GkQIPGM61pHrnMKAxw4aPZ/NMTc/ULWB0crFUOpOlt0esWFx7a7JG84DhXAuayNU3hwGNaapar7vmaByqTg8Yq4aCKWnbhQcM51rQZDaBHEYjA0aNAw9GuqG3d3q8vXp5gwcM51rOzIwxOV1gqI7NaiG4iO+anCaXn6nrcYvVa/TVqNNfp+cw2i1gJDofhnPV2jM5zUd/cA/7cu3fOubpRy7n1c86qur9pvIFzOo38GAkujhd+I07GKzTkCOlfpceP+C1atXf28PKoQG+c+fDPLhrX+z9Dl02yFtPfUJNU8PGcfsDu/naz/6A1WFIrnvT4zzxcYcu/EANlGjAkLQB+ATQC3zOzD5Ysv5I4EvA8nCbi8JJl4rX3wVcYmYfSTKtrjX8ZNsOvnzLHxgdTtHfRmW7pfZO5bn+V4/UFDAm6jy0eeSZR61g7cgQt963s67HLfWcY1exfFH/go9z2pMex4/uGePmbTtibT85XWD35DRnnXQkhy1ftODXL+erP/0D37ztQQ6pQ85g+eIB/ugJo3VIVeMkFjAk9QKXA6cC24HNkjaFs+xF3kUwdeunJJ1AMDvf0UXrLwO+l1QaXetJh/0Ern/L89p6trXLb9rGP99wD1PTharnOpis89DmkeMeN8xNb3tBXY+ZpH/6s6dUtf0P7nqUc7+8hR3j2cQCRjqT5UmHLWXT+ackcvxWVzFfKmmLpDdLWlHlsU8CtpnZvWaWA64GzizZxoCl4eNlwENFr/sy4F5ga5Wv69pYejxLX5s1NSwn6rhWSxn8bA6jzpXena4RTXHbbbDAeotTkHkWcBhBDuFqSacpXgHh4cADRc+3h8uKXQK8RtJ2gtzFBQCShoB3AO+t9CKSzguD2pZ0Oh0jWa6Vje3NMrIkRU8bF0cBjC6tvZXPbA6jzs1qO100xEaSE0WNZRY+7Ek7qxgwzGybmb0TeALwNeALwP2S3itpvlney/3iS6uKNgJXmtka4HTgK5J6CALFZWY2HiN9V5jZejNbPzraXuWB7mDt2NSwnIXMGjdR58mTukXSLasKM8bOie7OYcT6Rkp6KvA6gov6fwD/DpwC/BB4+hy7bQeOKHq+hqIip9A5wAYAM7tF0iAwAjwLeIWkDxNUiM9ImjKzf4uTXte+0pkshyxtnwll5rJ6Af0IJrOew6hFqq+X5Yv7EwsYj01kmbH2Gsqj3ioGDEm3AruBzxO0Yoo+jZ9Jeu48u24G1klaCzxIULT1lyXb3A+8CLhS0vHAIJA2s+cVvf4lwLgHi+6QzmR58mHLmp2MBVs5NIBU293ueBQwPIdRtSRHuW3HsZ/qLc438pVmdm+5FWb2Z3PtZGZ5SecDNxA0mf2CmW2VdCmwxcw2ARcCn5X0VoLiqrPN6tHC2bWjwozx2ESuI36Qfb09rBoaqOnitX8+b89hVGt0OLnxsjxgxAsYfyPpw2a2GyBsLXWhmb2r0o5hn4rrSpa9u+jxXcB8uRTM7JIYaXQdYNdkjsKMdcwPcqTGu92J2Upvz2FUa3Q4xW33707k2PUa9qSdxWkl9dIoWACY2S6Cugzn6qrT7uBqvdudzBboEaQS6o3dyaIiqSQKKuo17Ek7i/ON7JU0+w5JWgR07zvmEtOOM5DNZ/XwIDtqzGEMDfQlNrxFJ1u9NMW+6cJsS7N6SmeyDKf6WNTFRYVx8rxfBW6U9EWCeobXEwzn4VxdteMMZPOJRoc1s6ou/pPZAou9hVRNou/O2N4plowuqeuxx9pwsMB6qxgwzOzDkn5J0JpJwPvM7IbEU+a6TpTDGOmQdu6jwylyhRn27JtmeRU916Mchqte8Twax9Q5YKQzWUY8YFRmZt/Dx3RyCUtnsgwN9HZMZW/xUBXVBIzJnOcwapXk
PBo7MlmOP2xp5Q07WJyxpJ4tabOkcUk5SQVJexuRONddOqWXd6TW8aQmsp7DqFWS40l1+zhSEK/S+98IhvD4LbAI+BvgX5NMlOtO6cxUZwWMGu92J3P1nzypWyxf1E9fj+oeMPblCmSy+Y76ftYiVrs9M9sG9JpZwcy+CPxxssly3agdZyCbT613uxO5vHfaq1FPj2ru/zKfHd6kFohXhzEpaQC4PRzb6WFgKNlkuW6UzmQ55fEjzU5G3Swd7GOgr6fqi9dktuBFUguQRG/vTmvBV6s4OYy/Crc7H5ggGFDwz5NMlOs+U9MF9k7lWd0BAw9GJLF6uPq73Ylc3iu9F2D1cKqmUYLnE03s1Sl9hGo1721MOGveB8zsNcAUMeancK4W+4dd6KwfZLV3u2YW1GF4DqNmo8Mp7nxwT12P2WmjENRq3hyGmRWA0bBIyrnEdOqwC6NLqrvbzeZnKMyY5zAWYHQ4xWPjWQoz9RseJJ3J0iNYNdRZ389qxbmNuQ+4WdImgiIpAMzsY0klynWfTr2DGx1OseUPu2JvH41U6zmM2o0Op5gx2FnHkY/T41lWDqXobfOZIBcqzrfyofCvBxhONjmuW3VywNg5kWO6MEN/b+Uqw4lwLgxvJVW74v4vdQsYHdaCr1ZxhgbxeguXuHQmixRMPNRJoovMY+M5HrescoW+D22+cEn09vaAEYgz495NHDwXN2b2wkRS5LpSejzLysUDse7C20nx3W6sgJH1yZMWKone3ulMlsev9gKWOLcxbyt6PEjQpDYf5+CSNgCfIJhx73Nm9sGS9UcSjHy7PNzmIjO7TtKpwAeBASAHvN3MfhjnNV176tQ7uKiZcHp8Cqg89eyk5zAWbHbE2rAp7EKZGenxLKuXdt73s1pxiqRuLVl0s6QfVdovbJJ7OXAqsB3YLGlTOMte5F3ANWb2KUknEMzOdzSwA/hTM3tI0pMJpnk9PM4JufbUqUNHV3u36zmMhVs80MeSVF/dchi7J6eZLljHNfmuRZwiqZVFT3uAZwKPi3Hsk4Bt0Xzgkq4GzgSKA4YB0fCPywgq1zGz24q22QoMSkqZWTKT9bqm25HJcuxo5w0gMLIkqJOJe/GazWF4K6kFGa2hw+RcOrXJdy3ifCvWnhjCAAAX9ElEQVRvJbiwi6Ao6vfAOTH2Oxx4oOj5duBZJdtcAnxf0gUEw428uMxx/hy4ba5gIek84DyAI488MkayXKsxs44tkkr19bJsUX/8HEbYrNb7YSzMaB3Hk+rUFny1iFMktbbGY5drsFxaeb4RuNLMPirpZOArkp5sZjMAkp4EfAh4yTzpuwK4AmD9+vX1n8jXJW7vvjy5wkzHZvlHh1OzYxFVMpn1HEY9jA6nuPuR+szC4AFjvzjzYbxZ0vKi5yskvSnGsbcTjDsVWUNY5FTkHOAaADO7haBSfSR8nTXAN4HXmtnvYryea1NBhXDn/iCrududyBWQYFG/5zAWoq5FUh4wZsVpw3iume2OnpjZLuDcGPttBtZJWhsOLXIWsKlkm/sJpn5F0vEEASMdBqjvAheb2c0xXsu1sU4fCbSa8aQms3kW9/fS0+U9ihdqdDhFZirP1HRhwcdKj2dJ9fUw7C3XYgWMHhXNYB+2fqrYu8rM8gQj3N4A3E3QGmqrpEslnRFudiFwrqQ7gKuAs83Mwv0eD/yjpNvDv9VVnZlrG9Ed3Orhzhmptlg1I9ZO5Aos9gvTgtWzL0Y6EzSpLboMdq0438wbgGskfZqgDuKNwPVxDm5m1xE0lS1e9u6ix3cBzy2z3/uB98d5Ddf+Oj3LPzqcYjJXCKZerRAMJnN5hrxJ7YLt74uR5YiVixd0rLHMVMfWr1UrTsB4B0ErpL8lqMj+PvC5JBPluks6k2Wgr4elg515Z118t1spYExkCyz2Cu8Fq3U+9XLSmSxrRzqvyXct4nwzFwGfNbNPw2yRVAqYTDJhrnukM1lGl3Rulr94bKOjK1x4JnN5hrxJ7YKt
ruN4UulMlpPWrqy8YReIU4dxI0HQiCwC/juZ5LhulB7vzD4YkWrK0ydynsOoh5VDA0gLz2Hk8jPsmpxmdEln1q9VK07AGDSz8ehJ+HhhhYLOFenUTnuRqHhkbG/lsY0ms57DqIe+3h5WDQ0sOGA8NtHZ9WvVihMwJiQ9I3oi6ZnAvuSS5LpNpweMFYsH6O1RrOKRiWzecxh1MlKH3t6d3iCjWnG+mX8PfENS1OnuUOBVySXJdZPpwgw7J3Md3Qqlp0eMLIl3tzuRK3grqTqpdj71cjxgHCjO0CCbJT0ROI6gldSvzWw68ZS5rrBzIocZHT909OrhwVgBYzKX934YdbJ6eJDfje1Y0DHGZvsIdfb3M66438zjgBMIemKfKAkz+3JyyXLdYvYOroNzGBCNJzV/HUYuP8N0wTyHUSdRDsPMam6BF30/Vy3prJkgaxVnLKn3AP8a/v0x8GHgjHl3ci6m6CLa6Vn+OONJRUObex1GfYwOp5guGHv21V4gks5kWb64n1SfB3GIV+n9CoLxnh4xs9cBTyPoh+HcgnVLGfHocIod4zlmZuYeUDka2txbSdVHPYYHifoIuUCcgLEvHG48L2kpMAYck2yyXLeIfswjHf6jHB1OUZgxdk3m5twmGtrccxj1UY/e3p3eR6hacQLGlnD02M8STKb0C+DniabKdY10JsvSwT4GO3w47+KxjebiOYz6Gq1Db+9Ob/JdrTitpKK5Lz4t6XpgqZndGa2X9CQz25pUAl1n65Y7uOLikeMPLb+NT55UXwstkpqdCbLDc7/ViJPDmGVm9xUHi9BX6pge12XSmWzHDmtebHWMi9f+HIYHjHpYOthHqq8n9myHpcazefZNFzq+yXc1qgoYc+jMEeNcQ3RLlj+qo5mveGR/KykvkqoHSQuaea9bGmRUox4Bw+fRdjUb65KAMZTqY2igd/4cRtZzGPVWl4DhAw/OqkfAmJOkDZLukbRN0kVl1h8p6SZJt0m6U9LpResuDve7R9JpSabTNcdENs9krtAVAQMqX7w8h1F/1cynXirKDXbL9zOOegSMsu0Ew3kzLgdeStBLfKOkE0o2exfB1K0nEsz5/clw3xPC508CNgCfDI/nOki39PKOVAoYUQ7Dm9XWz0LGk/IiqYPF6el943zLzOzZc+x6ErDNzO41sxxwNXBmyTYGLA0fLwOiAQ7PBK42s6yZ/R7YFh7PdZBuu4OrdPGazOUZ7O+ht8erBetldDjFzokc04WZqvdNZ7L09Yjli/oTSFl7mjNgSBqUtBIYkbRC0srw72jgsBjHPhx4oOj59nBZsUuA10jaTjD39wVV7Bul8zxJWyRtSafTMZLlWkW33cFVKh6ZyOW9SW2dRd+tx8bn7jA5l3Qmy8iSFD0ewGfNl8N4A0FHvSeG/6O//yIoaqqk3LtcWkG+EbjSzNYApwNfkdQTc99godkVZrbezNaPjo7GSJZrFdHFs1tGAl29dJA9+6aZmi6UXT+RLbDYO+3VVdRku5Z6jPR41pvUlpjzdsbMPgF8QtIFZvavNRx7O3BE0fM17C9yipxDUEeBmd0iaRAYibmva3PpTJbeHrFicXeMBBrV1ewYz7JmxcGTVk5kPYdRb/t72E8RlHrHN7Y3y6HLvIVUsTiV3o9IGgaQ9C5J/1k8A988NgPrJK2VNEBQib2pZJv7CQY2RNLxBMOnp8PtzpKUkrQWWIcPR9Jxgiz/QNdk+Sv1PJ7MFbyFVJ0tpLd3t4xCUI04AeMfzSwj6RTgNOBLwKcq7WRmeeB84AbgboLWUFslXSopGh79QuBcSXcAVwFnW2ArcA1wF3A98GYzK5+Pd21rLDPVVT/ISheviVze+2DU2Ug4j0W1AaMwYzzmAeMgcb6d0YX6T4BPmdl/SbokzsHN7DqCyuziZe8uenwX8Nw59v0A8IE4r+PaU3q8u8bpqTQY3mS20DX1OY2S6utl2aL+qpvW7pzIMWPd0yAjrjg5jAclfQb4C+A6SamY
+zk3r24ZFiSycmgAqUIOw+sw6q6W3t7d1kcorjgX/r8gKFbaYGa7gZXA2xNNlet4MzPGjvFcVwWM/t4eVi4emL8Ow1tJ1V0tvb27rY9QXBUDhplNEkyadEq4KA/8NslEuc63azJHYca67g5uvrtdbyWVjFp6e3dbH6G44s7p/Q7g4nBRP/DVJBPlOl/0A169tLuaLY4Op8oOt50vzJDNz/iwIAlYPZxibG8Ws/jjpHbLXPPVilMk9XLgDGACwMweAoaTTJTrfN16BzdXDmNy2mfbS8rocIp904XZ+UbiSGeyLEn1eQAvESdg5CwIzQYgaSjZJLluMLa3OysVo+KR0rvdSR/aPDG19MXotgYZccX5dl4TtpJaLulc4PUE83s7V7NurVQcXZIil5/hNZ//GT3a32ExGi7EO+7VX/Qde+vXb2d4MF5A/tWDe1i32gtSSsV590aBa4G9wHHAu4EXJ5ko1/nSmSyL+nu77o76eetGOWnto0yWKR45+ZhVPOPIFU1IVWd76uHLed66EcazecbDedMrOXpkiJc/o+x4p10tzq/1VDN7B/CDaIGkjxJUhDtXk3SmOwd2O+5xw1zzhpObnYyusmxxP18551nNTkZHmDNgSPpb4E3AMZLuLFo1DNycdMJcZ0tnuquXt3OdYL4cxteA7wH/BBRPr5oxs52Jpsp1vPR4liccsqTZyXDOVWG+4c33AHsI5qxwrq7SmSzPPXZVs5PhnKuCjwnlGi6bL7Bn33TXtZByrt15wHAN162d9pxrdx4wXMN5wHCuPXnAcA23f+jo7hpHyrl2l2jAkLRB0j2Stkm6qMz6yyTdHv79RtLuonUflrRV0t2S/kVSd8zj2QW6tZe3c+0usW62knqBy4FTge3AZkmbwln2ADCztxZtfwFwYvj4OQQz8T01XP0T4PnA/ySVXtc46UwWCVaF02c659pDkjmMk4BtZnavmeWAq4Ez59l+I8G83hAMdDgIDAApgiHVH00wra6B0pksKxcP0N/rJaLOtZMkf7GHAw8UPd8eLjuIpKOAtcAPAczsFuAm4OHw7wYzu3uOfc+TtEXSlnQ6Xcfku6T4SKDOtackA0a5Ooe5ZjA5C7jWzAoAkh4PHA+sIQgyL5T0R+V2NLMrzGy9ma0fHR2tQ7Jd0sY8YDjXlpIMGNuBI4qerwEemmPbs9hfHAXBpE0/NbNxMxsnGKLk2Ymk0jWcjyPlXHtKMmBsBtZJWitpgCAobCrdSNJxwArglqLF9wPPl9QnqZ+gwrtskZRrL2ZGetxzGM61o8QChpnlgfOBGwgu9teY2VZJl0o6o2jTjcDVduAUZNcCvwN+CdwB3GFm304qra5x9k7lyeVnPGA414YSnb3GzK4DritZ9u6S55eU2a8AvCHJtLnm8F7ezrUvb9foGmp/L28PGM61Gw8YrqGiXt7dONuec+3OA4ZrKB9Hyrn25QHDNdRYZoqB3h6WLkq0+sw5lwAPGK6hol7ePpakc+3HA4ZrqHQmy4i3kHKuLXnAcA3lvbyda18eMFxD7fBe3s61LQ8YrmHyhRkem8ix2gOGc23JA4ZrmJ0TOcy8l7dz7coDhmuYMR8WxLm25gHDNYyPI+Vce/OA4RrGx5Fyrr15wHANE40j5TkM59qTBwzXMOlMluHBPgb7e5udFOdcDTxguIZJ+1zezrW1RAOGpA2S7pG0TdJFZdZfJun28O83knYXrTtS0vcl3S3pLklHJ5lWl7x0Jut9MJxrY4kNGSqpF7gcOBXYDmyWtMnM7oq2MbO3Fm1/AXBi0SG+DHzAzH4gaQkwk1RaXWOMZaZ4yprlzU6Gc65GSeYwTgK2mdm9ZpYDrgbOnGf7jcBVAJJOAPrM7AcAZjZuZpMJptU1gI8j5Vx7SzJgHA48UPR8e7jsIJKOAtYCPwwXPQHYLek/Jd0m6Z/DHEu5fc+TtEXSlnQ6Xcfku3qayOaZyBW8DsO5NpZkwCg34YHNse1ZwLVmVgif9wHPA94G
/C/gGODscjua2RVmtt7M1o+Oji4sxS4xO7xJrXNtL8mAsR04ouj5GuChObY9i7A4qmjf28LirDzwLeAZiaTSNYT38nau/SUZMDYD6yStlTRAEBQ2lW4k6ThgBXBLyb4rJEVZhhcCd5Xu69qH9/J2rv0lFjDCnMH5wA3A3cA1ZrZV0qWSzijadCNwtZlZ0b4FguKoGyX9kqB467NJpdUlL+rlvXqpBwzn2lVizWoBzOw64LqSZe8ueX7JHPv+AHhqYolzDZXOZOntESsWDzQ7Kc65GnlPb9cQY3uzrBoaoLenXFsI51w78IDhGiLtU7M61/Y8YLiG8HGknGt/HjBcQ3gvb+fanwcMl7iZGWOHF0k51/Y8YLjE7d43TX7GPGA41+Y8YLjERZ32Vg8PNjklzrmF8IDhEjeWmQJ8WBDn2p0HDJc4H0fKuc7gAcMlzgOGc53BA4ZLXDqTZVF/L0MDZac0cc61CQ8YLnFRL2/JhwVxrp15wHCJ817eznUGDxgucd7L27nO4AHDJS49nvV5MJzrAB4wXKKy+QK7J6c9h+FcB0g0YEjaIOkeSdskXVRm/WWSbg//fiNpd8n6pZIelPRvSabTJWfHeA7wJrXOdYLEZtyT1AtcDpwKbAc2S9pkZrNzc5vZW4u2vwA4seQw7wN+lFQaXfK8D4ZznSPJHMZJwDYzu9fMcsDVwJnzbL8RuCp6IumZwCHA9xNMo0uYBwznOkeSAeNw4IGi59vDZQeRdBSwFvhh+LwH+Cjw9kovIuk8SVskbUmn0wtOtKsvDxjOdY4kA0a5Xlo2x7ZnAdeaWSF8/ibgOjN7YI7t9x/Q7AozW29m60dHR2tMqktKFDBWDXnAcK7dJVaHQZCjOKLo+RrgoTm2PQt4c9Hzk4HnSXoTsAQYkDRuZgdVnLvWlh6fYuXQAAN93iDPuXaXZMDYDKyTtBZ4kCAo/GXpRpKOA1YAt0TLzOzVRevPBtYnGSze++2t/OS3O5I6fFd7ZM8Uhy73eTCc6wSJBQwzy0s6H7gB6AW+YGZbJV0KbDGzTeGmG4GrzWyu4qrEHbpskHWHLGnWy3e0dYcs4UVPPKTZyXDO1YGaeJ2uu/Xr19uWLVuanQznnGsrkm41s/WVtvOCZeecc7F4wHDOOReLBwznnHOxeMBwzjkXiwcM55xzsXjAcM45F4sHDOecc7F4wHDOORdLR3Xck5QG/lDj7iNAN44P4ufdXbr1vKF7zz3OeR9lZhVHb+2ogLEQkrbE6enYafy8u0u3njd077nX87y9SMo551wsHjCcc87F4gFjvyuanYAm8fPuLt163tC951638/Y6DOecc7F4DsM551wsHjCcc87F4gEDkLRB0j2StknqqHnDJR0h6SZJd0vaKukt4fKVkn4g6bfh/xXhckn6l/C9uFPSM5p7BrWT1CvpNknfCZ+vlfSz8Jy/LmkgXJ4Kn28L1x/dzHQvlKTlkq6V9Ovwcz+5Sz7vt4bf8V9JukrSYCd+5pK+IGlM0q+KllX9+Ur663D730r66ziv3fUBQ1IvcDnwUuAEYKOkE5qbqrrKAxea2fHAs4E3h+d3EXCjma0DbgyfQ/A+rAv/zgM+1fgk181bgLuLnn8IuCw8513AOeHyc4BdZvZ44LJwu3b2CeB6M3si8DSC96CjP29JhwN/B6w3sycTTAt9Fp35mV8JbChZVtXnK2kl8B7gWcBJwHuiIDMvM+vqP+Bk4Iai5xcDFzc7XQme738BpwL3AIeGyw4F7gkffwbYWLT97Hbt9AesCX84LwS+A4igt2tf6edOMO/8yeHjvnA7NfscajzvpcDvS9PfBZ/34cADwMrwM/wOcFqnfubA0cCvav18gY3AZ4qWH7DdXH9dn8Ng/xctsj1c1nHCbPeJwM+AQ8zsYYDw/+pws055Pz4O/AMwEz5fBew2s3z4vPi8Zs85XL8n3L4dHQOkgS+GxXGfkzREh3/eZvYg8BHgfuBhgs/wVrrjM4fqP9+aPncP
GMGdZ6mOa2ssaQnwH8Dfm9ne+TYts6yt3g9J/xsYM7NbixeX2dRirGs3fcAzgE+Z2YnABPuLJ8rpiHMPi1POBNYChwFDBMUxpTrxM5/PXOdZ0/l7wAgi6xFFz9cADzUpLYmQ1E8QLP7dzP4zXPyopEPD9YcCY+HyTng/ngucIek+4GqCYqmPA8sl9YXbFJ/X7DmH65cBOxuZ4DraDmw3s5+Fz68lCCCd/HkDvBj4vZmlzWwa+E/gOXTHZw7Vf741fe4eMGAzsC5sTTFAUFG2qclpqhtJAj4P3G1mHytatQmIWkb8NUHdRrT8tWHrimcDe6Ksbrsws4vNbI2ZHU3wef7QzF4N3AS8Itys9Jyj9+IV4fZtebdpZo8AD0g6Llz0IuAuOvjzDt0PPFvS4vA7H513x3/moWo/3xuAl0haEebOXhIum1+zK29a4Q84HfgN8Dvgnc1OT53P7RSCrOadwO3h3+kE5bU3Ar8N/68MtxdBq7HfAb8kaHXS9PNYwPm/APhO+PgY4OfANuAbQCpcPhg+3xauP6bZ6V7gOT8d2BJ+5t8CVnTD5w28F/g18CvgK0CqEz9z4CqCepppgpzCObV8vsDrw/PfBrwuzmv70CDOOedi8SIp55xzsXjAcM45F4sHDOecc7F4wHDOOReLBwznnHOxeMBwrgJJl0h6W63rw21eFmdQy/BYk5JWFy0bry7FziXDA4ZzjfEygtGQ49gBXJhgWpyriQcM58qQ9E4Fc6T8N3BcuOxYSddLulXS/5P0xDL7HbSNpOcAZwD/LOn2cJv5jvUF4FXhENTFxx6S9F1Jd4RzPrwqwbfAuYP0Vd7Eue4i6ZkEQ4qcSPAb+QXByKdXAG80s99KehbwSYJxqoodtI2ZvVDSJoIe59eGr3HjPMcaJwgabyGYsyCyAXjIzP4kPMayep+7c/PxgOHcwZ4HfNPMJgHCi/0gwWB23wiGKgKCoSdmhSMCz7tNFdv9C3C7pI8WLfsl8BFJHyIIPv+vprNzrkYeMJwrr3TMnB6CuRWePs8+cbaJtZ2Z7Zb0NeBNRct+E+Z+Tgf+SdL3zezSCq/lXN14HYZzB/sx8HJJiyQNA38KTAK/l/RKmJ0r+WnFO1kwz8hc22SA4RjbFfsY8AbCGztJhwGTZvZVgsmC2nb+bdeePGA4V8LMfgF8nWBk3/8AoqKfVwPnSLoD2EowYU+puba5Gnh7OAvesXGOZWY7gG+yv7jqKcDPJd0OvBN4/0LP1blq+Gi1zjnnYvEchnPOuVg8YDjnnIvFA4ZzzrlYPGA455yLxQOGc865WDxgOOeci8UDhnPOuVj+f3qgZOlCdym+AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "finished\n"
     ]
    }
   ],
   "source": [
    "print(\"start\")\n",
    "\n",
    "# Text preprocessing\n",
    "folder_path = './Database/SogouC/Sample'\n",
    "\n",
    "all_words_list, train_data_list, train_class_list, test_data_list, test_class_list = text_processing(folder_path, test_rate=0.2)\n",
    "\n",
    "# Build the stopword set\n",
    "stopwords_file = './stopwords_cn.txt'\n",
    "stopwords_set = make_word_set(stopwords_file)\n",
    "\n",
    "# Feature extraction and classification\n",
    "flag = 'sklearn'\n",
    "deleteNs = range(0, 1000, 20)\n",
    "test_accuracy_list = []\n",
    "for deleteN in deleteNs:\n",
    "    feature_words = vocab_select(all_words_list, deleteN, stopwords_set)\n",
    "    train_feature_list, test_feature_list = text_features(train_data_list, test_data_list, feature_words, flag)\n",
    "    test_accuracy = text_classifier(train_feature_list, test_feature_list, train_class_list, test_class_list, flag)\n",
    "    test_accuracy_list.append(test_accuracy)\n",
    "print(test_accuracy_list)\n",
    "\n",
    "# Evaluate the results\n",
    "plt.figure()\n",
    "plt.plot(deleteNs, test_accuracy_list)\n",
    "plt.title('Relationship of deleteNs and test_accuracy')\n",
    "plt.xlabel('deleteNs')\n",
    "plt.ylabel('test_accuracy')\n",
    "plt.show()\n",
    "# plt.savefig('result.png')\n",
    "\n",
    "print(\"finished\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparing different naive Bayes classifiers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Classification\n",
    "def text_classifier_by_sklearn(train_feature_list, test_feature_list,\n",
    "                               train_label_list, test_label_list, sklearn_class_func=MultinomialNB):\n",
    "    # Fit the given sklearn naive Bayes classifier on the training features\n",
    "    classifier = sklearn_class_func().fit(train_feature_list, train_label_list)\n",
    "    # For MultinomialNB() usage and parameters, see: https://www.cnblogs.com/pinard/p/6074222.html\n",
    "    test_accuracy = classifier.score(test_feature_list, test_label_list)\n",
    "    return test_accuracy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "总样本数： 90\n",
      "MultinomialNB Accu: 0.6842105263157895\n",
      "GaussianNB Accu: 0.631578947368421\n",
      "ComplementNB Accu: 0.6842105263157895\n",
      "BernoulliNB Accu: 0.6842105263157895\n"
     ]
    }
   ],
   "source": [
    "# Text preprocessing\n",
    "folder_path = './Database/SogouC/Sample'\n",
    "\n",
    "all_words_list, train_data_list, train_class_list, test_data_list, test_class_list = text_processing(folder_path, test_rate=0.2)\n",
    "\n",
    "# Build the stopword set\n",
    "stopwords_file = './stopwords_cn.txt'\n",
    "stopwords_set = make_word_set(stopwords_file)\n",
    "\n",
    "deleteN = 20\n",
    "feature_words = vocab_select(all_words_list, deleteN, stopwords_set)\n",
    "train_feature_list, test_feature_list = text_features(train_data_list, test_data_list, feature_words, flag)\n",
    "\n",
    "# Map each display name to its sklearn naive Bayes class\n",
    "name_func_dict = {\n",
    "    \"MultinomialNB\": MultinomialNB,\n",
    "    \"GaussianNB\": GaussianNB,\n",
    "    \"ComplementNB\": ComplementNB,\n",
    "    \"BernoulliNB\": BernoulliNB,\n",
    "}\n",
    "\n",
    "# Train and score every classifier on the same train/test split\n",
    "for func_name, func in name_func_dict.items():\n",
    "    test_accuracy = text_classifier_by_sklearn(train_feature_list, test_feature_list,\n",
    "                                               train_class_list, test_class_list, sklearn_class_func=func)\n",
    "    print(\"{} Accu: {}\".format(func_name, test_accuracy))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
